nth largest value from group in PostgreSQL

In this post I will show you how to use window function in PostgreSQL to find the nth largest value from every group. You can use pgAdmin tool to create the database. First open the pgAdmin tool from start menu. After that from the object browser on the left side under Servers select ‘localhost’. Right click on it and click connect. You may be prompted to enter the password, enter it and submit. Then you will see the databases list under it. Right click on the databases and select ‘New databases’. You will see a dialog box. Under properties tab enter the name of the database and press ok button. In my example I will create the database as ‘mydb’.

nth largest

nth largest

Next we will create three tables ‘project’,’milestone’ and ‘project_milestone’. Right click on the ‘Tables’ menu under mydb->schemas->public->Tables. Right click and  select ‘New Table’. Enter the name as ‘project’ Under the tables menu you will see the project table created. Under the ‘project’ table right click on the columns menu and select ‘New Column’. Enter name as ‘project_id’ and Data type as serial and press ok button.

nth largest

Next create another column with name ‘project_name’ and Data type as ‘character varying’. Now right click on the ‘project’ table and click constraints tab and add primary key.

nth largest

Similarly create table ‘milestone’ with columns milestone_id (serial) and milestone_name (character varying). Create table ‘project_milestone’ with columns id (serial), project_id (integer), milestone_id (integer), completed_date (timestamp without time zone). Column id is the primary key and Columns project_id and milestone_id in table ‘project_milestone’ are foreign keys. You can add foreign keys by right clicking on the ‘project_milestone’ table and clicking constraints  tab and add foreign key. Then from the actions tab add on update and on delete constraints.

nth largest

Here I will show you how to find the nth largest completed_date from project_milestone. Now we can run queries to insert values into the tables. First right click on the mydb from the menu and select create script. Right below query to insert values into project table.

 

insert into project(project_name)values('Project A');
insert into project(project_name)values('Project B');

Next we will insert values into milestone.

insert into milestone(milestone_name)values('Milestone 1');
insert into milestone(milestone_name)values('Milestone 2');
insert into milestone(milestone_name)values('Milestone 3');
insert into milestone(milestone_name)values('Milestone 4');
insert into milestone(milestone_name)values('Milestone 5');
insert into milestone(milestone_name)values('Milestone 6');
insert into milestone(milestone_name)values('Milestone 7');
insert into milestone(milestone_name)values('Milestone 8');
insert into milestone(milestone_name)values('Milestone 9');
insert into milestone(milestone_name)values('Milestone 10');

Next we will insert values into project_milestone.

insert into project_milestone(project_id,milestone_id,completed_date)values(1,1,'2015-01-01');
insert into project_milestone(project_id,milestone_id,completed_date)values(1,2,'2015-02-01');
insert into project_milestone(project_id,milestone_id,completed_date)values(1,3,'2015-03-01');
insert into project_milestone(project_id,milestone_id,completed_date)values(1,4,'2015-04-01');
insert into project_milestone(project_id,milestone_id,completed_date)values(1,5,'2015-05-01');
insert into project_milestone(project_id,milestone_id,completed_date)values(1,6,'2015-06-01');
insert into project_milestone(project_id,milestone_id,completed_date)values(1,7,'2015-07-01');
insert into project_milestone(project_id,milestone_id,completed_date)values(1,8,'2015-08-01');
insert into project_milestone(project_id,milestone_id,completed_date)values(1,9,'2015-09-01');
insert into project_milestone(project_id,milestone_id,completed_date)values(1,10,'2015-10-01');

insert into project_milestone(project_id,milestone_id,completed_date)values(2,1,'2015-01-15');
insert into project_milestone(project_id,milestone_id,completed_date)values(2,2,'2015-02-15');
insert into project_milestone(project_id,milestone_id,completed_date)values(2,3,'2015-03-15');
insert into project_milestone(project_id,milestone_id,completed_date)values(2,4,'2015-04-15');
insert into project_milestone(project_id,milestone_id,completed_date)values(2,5,'2015-05-15');
insert into project_milestone(project_id,milestone_id,completed_date)values(2,6,'2015-06-15');
insert into project_milestone(project_id,milestone_id,completed_date)values(2,7,'2015-07-15');
insert into project_milestone(project_id,milestone_id,completed_date)values(2,8,'2015-08-15');
insert into project_milestone(project_id,milestone_id,completed_date)values(2,9,'2015-09-15');
insert into project_milestone(project_id,milestone_id,completed_date)values(2,10,'2015-10-15');

If you want find the nth largest date value grouped by project, you can use the below query. Replace pm.seqnum = 1 by the nth value required. For example to find the 2nd largest date use pm.seqnum = 2

select p.project_id, pm.milestone_id, pm.completed_date
from project p left join
(select pm.*,
row_number() over (partition by project_id order by completed_date desc) as seqnum
from project_milestone pm
where pm.completed_date is not null
) pm
on p.project_id = pm.project_id and pm.seqnum = 1;