Job scheduling with Postgres: Improve database management with automation
Efficient job scheduling is essential for automating tasks and ensuring the smooth operation of a database management system. By removing manual effort from routine tasks such as data backups, IT teams can keep workflows running predictably. Job scheduling also helps minimize manual errors across a variety of use cases, which supports data consistency and integrity.
You have options when it comes to job schedulers that interact with a Postgres database, including popular choices like pgAgent, cron jobs and RunMyJobs by Redwood. This post breaks down the key differences between these platforms in relation to Postgres.
What is a database management system?
Before diving into job scheduling with Postgres, it’s important to understand the concept of a database management system (DBMS). A DBMS is a software application that facilitates the creation, organization and management of databases. These systems provide users an interface to interact with the database so they can easily store, retrieve, modify and delete data.
PostgreSQL, also known as Postgres, is a popular and feature-rich open source DBMS. It’s known for its robustness, extensibility and compliance with SQL standards. Postgres offers a wide range of features like advanced data types, support for JSON and XML, full-text search capabilities and support for numerous programming languages.
Understanding PostgreSQL
PostgreSQL, or Postgres, is an open-source relational database management system (RDBMS). It was developed at the University of California, Berkeley in the 1980s and has since grown into a powerful and widely used system.
This database system is designed for a wide range of workloads, from small personal projects to large-scale enterprise applications. It follows the Structured Query Language (SQL) standard and offers advanced features for data management.
PostgreSQL is a relational database management system, meaning it organizes and manages data in tables with predefined schemas. It supports complex relationships between tables, enforces data integrity through constraints, and provides powerful query capabilities.
PostgreSQL is compliant with the Atomicity, Consistency, Isolation, Durability (ACID) properties. This guarantees that database transactions are executed safely and reliably and protects against data corruption and loss.
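As a simple illustration of atomicity, the classic transfer-between-accounts pattern either applies both updates or neither. The accounts table and its columns here are hypothetical and used only for illustration:
BEGIN;
-- 'accounts' is a hypothetical table used only for this example
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
-- If anything fails before COMMIT, a ROLLBACK leaves both rows untouched
COMMIT;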
How to schedule jobs with Postgres and pgAgent
One method to schedule jobs in Postgres is the pgAgent tool. pgAgent is a job scheduling agent specifically designed for Postgres databases. It allows users to schedule and automate tasks like database backups, data loading, and report generation.
pgAgent must first be installed and configured with the PostgreSQL database. After it has been installed, jobs can be created using SQL statements, stored procedures or shell scripts. The pgAgent jobs can then be scheduled to run at specific times or intervals.
pgAgent runs as a separate daemon (or service on Windows) that connects to the Postgres database and polls for scheduled work, ensuring efficient execution and monitoring of scheduled tasks.
The process for scheduling jobs with Postgres and pgAgent involves several steps, including installing pgAgent on the machine where the DBMS is running. After pgAgent is installed, it needs to be connected to the Postgres database using a client like psql or pgAdmin.
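On a Debian- or Ubuntu-based system, for example, installing the agent and starting it might look like the lines below. The package name, service setup and connection options vary by platform and pgAgent version, and the daemon is typically started only after the database-side setup described next:
sudo apt-get install pgagent
# Once the pgagent schema exists, start the daemon with a libpq connection string
pgagent hostaddr=127.0.0.1 dbname=postgres user=postgres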
The following command, run in the target database, creates the pgAgent extension along with its pgagent schema and tables:
CREATE EXTENSION pgagent;
On older installations where pgAgent is not packaged as an extension, the schema and tables can instead be created by running the SQL script shipped with pgAgent. This can usually be found in the share/pgagent.sql file within the PostgreSQL installation directory:
\i /path/to/share/pgagent.sql
After pgAgent has been set up successfully, new jobs can be created to schedule specific tasks. pgAgent defines jobs, steps and schedules as rows in its own tables, so a job can be created from the CLI by inserting into them. Here is an example that creates a simple job with a single SQL step:
DO $$
DECLARE
    my_job_id integer;
BEGIN
    -- Create the job (job class 1 is the built-in "Routine Maintenance" class)
    INSERT INTO pgagent.pga_job (jobjclid, jobname, jobdesc, jobhostagent, jobenabled)
    VALUES (1, 'MyJob', 'My job description', '', true)
    RETURNING jobid INTO my_job_id;

    -- Add one SQL step that runs against the target database
    INSERT INTO pgagent.pga_jobstep (jstjobid, jstname, jstdesc, jstenabled, jstkind, jstcode, jstconnstr)
    VALUES (my_job_id, 'MyJobStep', '', true, 's',
            'SELECT COUNT(*) FROM my_table;',
            'host=localhost dbname=my_database user=my_user password=my_password');
END
$$;
Adjust the job name, description, SQL code and database connection details in the above example to fit specific job requirements.
After creating a job, it can be scheduled to run at specified intervals using the following command:
BEGIN;
INSERT INTO pgagent.pga_schedule (
    jscjobid, jscname, jscdesc, jscenabled, jscstart, jscend,
    jscminutes, jschours, jscweekdays, jscmonthdays, jscmonths
)
VALUES (
    (SELECT jobid FROM pgagent.pga_job WHERE jobname = 'MyJob'),
    'MyJobSchedule',
    'My job schedule description',
    true,
    now(),
    NULL,
    -- minute flags: true at minutes 0, 15, 30 and 45
    (SELECT array_agg(m % 15 = 0 ORDER BY m) FROM generate_series(0, 59) AS s(m)),
    array_fill(true, ARRAY[24]),   -- every hour
    array_fill(true, ARRAY[7]),    -- every day of the week
    array_fill(true, ARRAY[32]),   -- every day of the month
    array_fill(true, ARRAY[12])    -- every month
);
COMMIT;
The above example shows a schedule called “MyJobSchedule” that runs every 15 minutes. Adjust the schedule name, description and time intervals to fit specific schedule requirements.
Users can monitor job execution and view job logs with pgAdmin. pgAgent provides a set of tables to store job-related metadata and logs, such as pga_schedule and pga_jobsteplog.
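For example, recent step executions can be inspected directly from psql. The query below assumes the pga_jobsteplog column names used by current pgAgent releases:
-- Show the ten most recent job step executions and their output
SELECT jslid, jsljstid, jslstatus, jslstart, jslduration, jsloutput
FROM pgagent.pga_jobsteplog
ORDER BY jslstart DESC
LIMIT 10;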
How to schedule jobs with Postgres and cron jobs
Another solution for scheduling and running jobs in Postgres is using the cron job functionality available in Unix operating systems like Linux. Cron is a time-based job scheduler that allows users to automate tasks on a recurring basis. By combining the power of cron with Postgres, jobs that interact with the database can be scheduled and automated.
Cron is a daemon, a background process that executes non-interactive jobs. A crontab file is a simple text file containing commands to run periodically at specified times. The default system cron table, or crontab, configuration file is /etc/crontab.
Cron jobs are scheduled by creating a shell script or command-line executable that performs the desired database operations. This can be done using SQL statements, psql commands or other means of interacting with the database within the script. Once the script is created, cron is configured to execute it at specified intervals.
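For instance, a nightly backup could be wired up with a short script and a crontab entry like the ones below. The paths, database name and credentials are placeholders:
#!/bin/sh
# /usr/local/bin/backup_mydb.sh -- dump the database to a dated file
# credentials for unattended runs are usually supplied via ~/.pgpass
pg_dump -h localhost -U my_user my_database > /var/backups/my_database_$(date +%F).sql
The script is then registered with cron (for example via crontab -e) to run every night at 2:30 a.m.:
# m h dom mon dow command
30 2 * * * /usr/local/bin/backup_mydb.sh >> /var/log/backup_mydb.log 2>&1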
To schedule cron-style jobs inside Postgres itself, use the pg_cron extension. Start by configuring the PostgreSQL database with appropriate permissions and adjusting the postgresql.conf file; pg_cron must be added to shared_preload_libraries, which requires a server restart. Then install the pg_cron extension by connecting to the Postgres database as a superuser with psql.
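A minimal setup, assuming the target database is named my_database, looks like this (restart the server after editing postgresql.conf):
# postgresql.conf
shared_preload_libraries = 'pg_cron'
cron.database_name = 'my_database'
Then, connected to that database as a superuser:
CREATE EXTENSION pg_cron;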
After the pg_cron extension has been installed, jobs can be scheduled with the cron.schedule function. This allows users to specify a cron expression to define the job schedule.
SELECT cron.schedule('* * * * *', 'SELECT my_function();');
In this example, the SELECT my_function(); statement represents the task to be executed on the specified schedule, and the cron expression * * * * * runs it every minute.
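Recent pg_cron versions also accept a job name as the first argument, which makes jobs easier to identify later. For example, to run a nightly VACUUM at 3:00 a.m. (the job name is arbitrary):
SELECT cron.schedule('nightly-vacuum', '0 3 * * *', 'VACUUM');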
To remove a scheduled job, use the cron.unschedule function.
SELECT cron.unschedule(jobid);
Replace jobid in the above example with the identifier of the job to unschedule.
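Job identifiers can be looked up in pg_cron's cron.job metadata table, for example:
-- List all scheduled pg_cron jobs with their identifiers
SELECT jobid, schedule, command FROM cron.job;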
By default, scheduling jobs with pg_cron requires superuser privileges, so other users must be granted the necessary permissions before they can create jobs.
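To allow a non-superuser to schedule jobs, a superuser can grant access to the cron schema. The role name below is a placeholder:
GRANT USAGE ON SCHEMA cron TO my_user;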
Redwood job scheduler for PostgreSQL
For teams using Postgres for database management, RunMyJobs offers event-driven job scheduling for workload orchestration. This task scheduler drives digital transformation with adaptable automation software to streamline disconnected IT and business processes.
Users can easily schedule and run event-driven workflows and manage file transfers. Predictive and real-time SLA monitoring guarantees performance of automated tasks, and notifications can be configured to alert through SMS or email if something goes wrong.
Teams don’t have to worry about hosting, deploying or maintaining their own automation platform. The tool is designed to control servers and run scripts with lightweight job scheduling agents for Windows, Linux, macOS, Solaris and more.
The intuitive, low-code UI is extremely developer-friendly, with a drag-and-drop graphical editor and templates for building new jobs. Support for more than 25 scripting languages, including built-in syntax highlighting and parameter replacement, gives devs the freedom to code in their preferred language. Managed file transfers become much smoother with RunMyJobs, providing the ability to move, copy and manage millions of files per month across storage solutions like Amazon S3, PostgreSQL databases and more. Data workflows can be orchestrated across ERP ecosystems, including SAP and Oracle.