There are a huge number of tables that you can use to query medical history. This tutorial will take you through some of the tables and give you some example scripts that you can use in python or R to access the data, or using R in an interactive session on CloudOS, to access the data.
The advantage of querying medical history programmatically is that the underlying tables are exposed, allowing you to verify the data in bulk. This also allows you to do very complex queries. You can save and tweak scripts to re-use in new releases or for similar queries.
To access medical history, you might query the following tables:
Table
Full name
Details
hes_apc
Hospital episode statistics - Admitted patient care
Details of any overnight stays in hospital
hes_op
Hospital episode statistics - Outpatients
Details of any planned day appointments in hospital
hes_ae
Hospital episode statistics - Accident and emergency
Details of unplanned walk-in/ambulance visits to hospital
hes_cc
Hospital episode statistics - Critical care
Details of any time spent on life support
ecds
Emergency care dataset
Details of unplanned walk-in/ambulance visits to hospital
did
Diagnostic imaging metadata
Details of any medical imaging
The tables for medical history are extensive and were designed for medical record keeping, not for scientific research. They may have a lot of fields not relevant to research, or that are confusing. We recommend referring to the latest data dictionary for full details of these tables.
For your convenience, concatenations of some of the diagnosis and treatment columns, can make them easier to query. All the example code makes use of these concatenated columns, rather than the source data.
Some of the tables have fields which link them together, allowing you to see how medical events relate to each other. However many of the tables can only be linked through participant IDs and corresponding dates.
Tables
Column(s) linking them
Details
hes_apc and hes_ae
aekey and epikey
Where an accident and emergency visit results in the participant being admitted to an overnight stay
hes_apc and hes_cc
susrecid
Where a participant spends some time on life-support during an overnight stay
Dates in tables
Many of the tables have multiple date fields, signifying different events. We have highlighted that fields we think are the most useful.
Table
Field
Meaning
Most useful date in the table?
did
did_date1
The date of referral
did
did_date2
The date the referral was received
did
did_date3
The date of imaging
yes
did
did_date4
The date of the report
hes_ae
appdate
Record updates
hes_ae
arrivaldate
the date of arrival
yes
hes_ae
cdsextdate
Record updates
hes_ae
subdate
Record updates
hes_ae
suslddate
Record updates
hes_apc
apcend_1 - apcend_9
End date of periods of augmented care
hes_apc
apcstar_1 - apcstar_9
Start date of periods of augmented care
hes_apc
admidate
Date of admission
yes
hes_apc
anasdate
Date of first antenatal assessment
hes_apc
cdsextdate
Record updates
hes_apc
disdate
Date of discharge
yes
hes_apc
disreadydate
Date the participant was medically ready for discharge
hes_apc
elecdate
Date of decision to admit
hes_apc
epiend
Date of end of care under a particular consultant
hes_apc
epistart
Date of start of care under a particular consultant
hes_apc
opdate_01 - opdate_24
Date of corresponding numbered treatment in columns oper_01 - oper_24
hes_apc
rtt_per_end
End date of a referral treatment period
hes_apc
rtt_per_start
Start date of a referral treatment period
hes_apc
subdate
Record updates
hes_apc
suslddate
Record updates
hes_cc
ccdisdate
Date of discharge
yes
hes_cc
ccstartdate
Date of start
yes
hes_op
apptdate
Date of appointment
yes
hes_op
dnadate
Date of missed appointment
hes_op
reqdate
Date referral received
hes_op
rtt_per_end
End date of a referral treatment period
hes_op
rtt_per_start
Start date of a referral treatment period
hes_op
subdate
Record updates
hes_op
suslddate
Record updates
ecds
arrival_date
Date of arrival
yes
ecds
assessment_date
Date of assessment
yes
ecds
conclusion_date
Date of conclusion of treatment in accident and emergency
ecds
departure_date
Date of leaving accident and emergency
yes
ecds
injury_date
Date of injury
yes
ecds
investigation_date_01 - investigation_date_12
Date of corresponding numbered investigation in columns investigation_code_01 - investigation_code_12
Before you get started with using the LabKey APIs, you need to ensure you have it set up correctly, following this document. We recommend running your LabKey API scripts on the HPC, as detailed here, as they can be quite large.
Before you get started with using the LabKey APIs, you need to ensure you have it set up correctly, following this document. We recommend running your LabKey API scripts on the HPC, as detailed here, as they can be quite large.
To query our tables using CloudOS, you will need to set up an interactive session. You will need to use conda to install the required R packages:
LabKey has an underlying SQL database. To access it you will need to have the labkey module/library loaded. You will also be working with tables and dataframes. Here you can see the modules/libraries you should load in your scripts:
We have here a helper function called labkey_to_df that you can use to access the LabKey API in the RE and return data as a dataframe. You need to give it:
sql_query: an SQL query to access the LabKey tables
database: the version of the database you want to access for example /main-programme/main-programme_v18_2023-12-21
maxrows: maximum number of rows you want to return. This parameter defaults to 100000000, but if the output contains exactly the default value, then we suggest increasing this value.
Feel free to include this function in all your scripts and invoke it every time you want to query the data.
tidyup function for making diagnosis and treatment columns more readable¶
Many of the queries presented in this tutorial will pull up columns of concatenated data, such as diag_all and opertn_all. These are presented in the form:
code1|code2|code3||||||||
In this example above, code1 etc represent an ICD10 or opcs code. Each vertical pipe | represents one of the columns that were concatenated together; where these columns were empty, there are no spaces between the pipes.
The following function uses a regex to tidy up these to comma separated values, deleting any empty fields. It takes a dataframe as input and returns the tidied-up dataframe. You can use this any time you want to view the output of these analyses.
Get details from a single hes_* table or ecds for a participant¶
We're going to find all diagnoses for an individual from a hes_* table. In this case we're going to query hes_ae. If you want to query hes_apc or hes_op, you would need to change the date field to one(s) available in those tables; you might also want to add further diagnosis and treatment columns.
In this and all following code we will refer to a string part_id, which is the participant_id of the participant of interest. Assume that you have already defined this string in your code.
hes_sql<-paste("SELECT participant_id, arrivaldate, diag_all FROM hes_ae WHERE participant_id = ",part_id,sep="")hes_query<-labkey_to_df(hes_sql,version,1000)
hes_sql<-paste("SELECT participant_id, arrivaldate, diag_all FROM hes_ae WHERE participant_id = ",part_id,sep="")hes_query<-labkey_to_df(hes_sql,version,100000)
The table ecds covers emergency admissions after 2017. The ecds table has different column headings, so we have to alter our queries accordingly. The following query gets all the diagnoses, treatments and investigations for a participant, with the date of arrival.
diag_table<-data.frame()ae_sql<-paste("SELECT participant_id, arrivaldate as date, diag_all FROM hes_ae WHERE participant_id = ",part_id,sep="")ae_query<-labkey_to_df(ae_sql,version,1000)if (nrow(ae_query)>0){ae_query$table<-'hes_ae - A&E diagnosis code'diag_table<-rbind(diag_table,ae_query)}apc_sql<-paste("SELECT participant_id, admidate as date, diag_all FROM hes_apc WHERE participant_id = ",part_id,sep="")apc_query<-labkey_to_df(apc_sql,version,1000)if (nrow(apc_query)>0){apc_query$table<-'hes_apc - ICD10 code'diag_table<-rbind(diag_table,apc_query)}op_sql<-paste("SELECT participant_id, apptdate as date, diag_all FROM hes_op WHERE participant_id = ",part_id,sep="")op_query<-labkey_to_df(op_sql,version,1000)if (nrow(op_query)>0){op_query$table<-'hes_op - ICD10 code'diag_table<-rbind(diag_table,op_query)}ecds_sql<-paste("SELECT participant_id, seen_date as date, diagnosis_code_all as diag_all FROM ecds WHERE participant_id = ",part_id,sep="")ecds_query<-labkey_to_df(ecds_sql,version,1000)
diag_table<-data.frame()ae_sql<-paste("SELECT participant_id, arrivaldate as date, diag_all FROM hes_ae WHERE participant_id = ",part_id,sep="")ae_query<-labkey_to_df(ae_sql,version,1000)if (nrow(ae_query)>0){ae_query$table<-'hes_ae - A&E diagnosis code'diag_table<-rbind(diag_table,ae_query)}apc_sql<-paste("SELECT participant_id, admidate as date, diag_all FROM hes_apc WHERE participant_id = ",part_id,sep="")apc_query<-labkey_to_df(apc_sql,version,1000)if (nrow(apc_query)>0){apc_query$table<-'hes_apc - ICD10 code'diag_table<-rbind(diag_table,apc_query)}op_sql<-paste("SELECT participant_id, apptdate as date, diag_all FROM hes_op WHERE participant_id = ",part_id,sep="")op_query<-labkey_to_df(op_sql,version,1000)if (nrow(op_query)>0){op_query$table<-'hes_op - ICD10 code'diag_table<-rbind(diag_table,op_query)}ecds_sql<-paste("SELECT participant_id, seen_date as date, diagnosis_code_all as diag_all FROM ecds WHERE participant_id = ",part_id,sep="")ecds_query<-labkey_to_df(ecds_sql,version,1000)
Instead of using our tidyup function, we want to split the table to have one diagnosis per row
library(ggplot2)ggplot(data=diag_table,aes(x=date,y=diag_all,colour=table))+geom_point()+labs(x="Date of diagnosis",y="Codes",title="Diagnoses over time")
plt.plot_date(diag_table['date'],diag_table['diag_all'],xdate=True,ydate=False)plt.xticks(rotation=70)groups=diag_table.groupby('table')forname,groupingroups:plt.plot(group.date,group.diag_all,marker='o',linestyle='',markersize=3,label=name)plt.legend()plt.title('Diagnoses over time',fontweight="bold")plt.xlabel('Date of diagnosis')plt.ylabel('ICD10 or SNOMED CT codes')plt.show()
library(ggplot2)ggplot(data=diag_table,aes(x=date,y=diag_all,colour=table))+geom_point()+labs(x="Date of diagnosis",y="Codes",title="Diagnoses over time")
invest_table<-data.frame()ae_invest_sql<-paste("SELECT participant_id, arrivaldate as date,invest_all FROM hes_ae WHERE participant_id = ",part_id,sep="")ae_invest_query<-labkey_to_df(ae_invest_sql,version,1000)if (nrow(ae_invest_query)>0){ae_invest_query$table<-'hes_ae - A&E investigation code'invest_table<-rbind(invest_table,ae_invest_query)}did_invest_sql<-paste("SELECT participant_id, did_date3 as date, did_snomedct_code invest_all FROM did WHERE participant_id = ",part_id,sep="")did_invest_query<-labkey_to_df(did_invest_sql,version,1000)if (nrow(did_invest_query)>0){did_invest_query$table<-'did _SNOMED CT code'invest_table<-rbind(invest_table,did_invest_query)}ecds_invest_sql<-paste("SELECT participant_id, seen_date as date, investigation_code_all as invest_all FROM ecds WHERE participant_id = ",part_id,sep="")ecds_invest_query<-labkey_to_df(ecds_invest_sql,version,1000)
ae_invest_sql=(f''' SELECT participant_id, arrivaldate as date, invest_all FROM hes_ae WHERE participant_id = {part_id} ''')ae_invest_query=labkey_to_df(ae_invest_sql,version,1000)ae_invest_query['table']='hes_ae - A&E investigation code'did_invest_sql=(f''' SELECT participant_id, did_date3 as date, did_snomedct_code invest_all FROM did WHERE participant_id = {part_id} ''')did_invest_query=labkey_to_df(did_invest_sql,version,1000)did_invest_query['table']='did - SNOMED CT code'ecds_invest_sql=(f''' SELECT participant_id, seen_date as date, investigation_code_all as invest_all FROM ecds WHERE participant_id = {part_id} ''')ecds_invest_query=labkey_to_df(ecds_invest_sql,version,1000)ecds_invest_query['table']='ecds - SNOMED CT'invest_table=pd.concat([ae_invest_query,did_invest_query,ecds_invest_query])invest_table
invest_table<-data.frame()ae_invest_sql<-paste("SELECT participant_id, arrivaldate as date,invest_all FROM hes_ae WHERE participant_id = ",part_id,sep="")ae_invest_query<-labkey_to_df(ae_invest_sql,version,1000)if (nrow(ae_invest_query)>0){ae_invest_query$table<-'hes_ae - A&E investigation code'invest_table<-rbind(invest_table,ae_invest_query)}did_invest_sql<-paste("SELECT participant_id, did_date3 as date, did_snomedct_code invest_all FROM did WHERE participant_id = ",part_id,sep="")did_invest_query<-labkey_to_df(did_invest_sql,version,1000)if (nrow(did_invest_query)>0){did_invest_query$table<-'did _SNOMED CT code'invest_table<-rbind(invest_table,did_invest_query)}ecds_invest_sql<-paste("SELECT participant_id, seen_date as date, investigation_code_all as invest_all FROM ecds WHERE participant_id = ",part_id,sep="")ecds_invest_query<-labkey_to_df(ecds_invest_sql,version,1000)
We can process and plot this in the same way as we did for the diagnoses.
treat_table<-data.frame()ae_treat_sql<-paste("SELECT participant_id, arrivaldate as date, treat_all FROM hes_ae WHERE participant_id = ",part_id,sep="")ae_treat_query<-labkey_to_df(ae_treat_sql,version,1000)if (nrow(ae_treat_query)>0){ae_treat_query$table<-'hes_ae - A&E treatment code'treat_table<-rbind(treat_table,ae_treat_query)}apc_treat_sql<-paste("SELECT participant_id, admidate as date, opertn_all as treat_all FROM hes_apc WHERE participant_id = ",part_id,sep="")apc_treat_query<-labkey_to_df(apc_treat_sql,version,1000)if (nrow(apc_treat_query)>0){apc_treat_query$table<-'hes_apc - OPCS4'treat_table<-rbind(treat_table,apc_treat_query)}op_treat_sql<-paste("SELECT participant_id, apptdate as date, opertn_all as treat_all FROM hes_op WHERE participant_id = ",part_id,sep="")op_treat_query<-labkey_to_df(op_treat_sql,version,1000)if (nrow(op_treat_query)>0){op_treat_query$table<-'hes_op - OPCS4'treat_table<-rbind(treat_table,op_treat_query)}ecds_treat_sql<-paste("SELECT participant_id, seen_date as date, treatment_code_all as treat_all FROM ecds WHERE participant_id = ",part_id,sep="")ecds_treat_query<-labkey_to_df(ecds_treat_sql,version,1000)
ae_treat_sql=(f''' SELECT participant_id, arrivaldate as date, treat_all FROM hes_ae WHERE participant_id = {part_id} ''')ae_treat_query=labkey_to_df(ae_treat_sql,version,1000)ae_treat_query['table']='hes_ae - A&E treatment code'apc_treat_sql=(f''' SELECT participant_id, admidate as date, opertn_all as treat_all FROM hes_apc WHERE participant_id = {part_id} ''')apc_treat_query=labkey_to_df(apc_treat_sql,version,1000)apc_treat_query['table']='hes_apc -OPCS4'op_treat_sql=(f''' SELECT participant_id, apptdate as date, opertn_all as treat_all FROM hes_op WHERE participant_id = {part_id} ''')op_treat_query=labkey_to_df(op_treat_sql,version,1000)op_treat_query['table']='hes_op - OPCS4'ecds_treat_sql=(f''' SELECT participant_id, seen_date as date, treatment_code_all as treat_all FROM ecds WHERE participant_id = {part_id} ''')ecds_treat_query=labkey_to_df(ecds_treat_sql,version,1000)ecds_treat_query['table']='ecds - SNOMED CT'treat_table=pd.concat([ae_treat_query,apc_treat_query,op_treat_query,ecds_treat_query])
treat_table<-data.frame()ae_treat_sql<-paste("SELECT participant_id, arrivaldate as date, treat_all FROM hes_ae WHERE participant_id = ",part_id,sep="")ae_treat_query<-labkey_to_df(ae_treat_sql,version,1000)if (nrow(ae_treat_query)>0){ae_treat_query$table<-'hes_ae - A&E treatment code'treat_table<-rbind(treat_table,ae_treat_query)}apc_treat_sql<-paste("SELECT participant_id, admidate as date, opertn_all as treat_all FROM hes_apc WHERE participant_id = ",part_id,sep="")apc_treat_query<-labkey_to_df(apc_treat_sql,version,1000)if (nrow(apc_treat_query)>0){apc_treat_query$table<-'hes_apc - OPCS4'treat_table<-rbind(treat_table,apc_treat_query)}op_treat_sql<-paste("SELECT participant_id, apptdate as date, opertn_all as treat_all FROM hes_op WHERE participant_id = ",part_id,sep="")op_treat_query<-labkey_to_df(op_treat_sql,version,1000)if (nrow(op_treat_query)>0){op_treat_query$table<-'hes_op - OPCS4'treat_table<-rbind(treat_table,op_treat_query)}ecds_treat_sql<-paste("SELECT participant_id, seen_date as date, treatment_code_all as treat_all FROM ecds WHERE participant_id = ",part_id,sep="")ecds_treat_query<-labkey_to_df(ecds_treat_sql,version,1000)
Match accident and emergency episodes to overnight admissions¶
Where accident and emergency admissions resulted in the participant being admitted for overnight care, the entries in the hes_ae and hes_apc tables are linked by the aekey and epikey. The following query finds all diagnoses and operations the participant received in both A&E and after admission to overnight care. Note that in this code we rename the column headers, since many of these are the same between hes_ae and hes_apc.
ae_apc_sql<-paste("SELECT ae.participant_id, ae.arrivaldate as ae_arrival, ae.diag_all as ae_diag, apc.admidate as apc_arrival, apc.disdate as apc_discharge, apc.opertn_all as apc_operation, apc.diag_all as apc_diag FROM hes_ae as ae, hes_apc as apc WHERE ae.epikey = apc.epikey AND ae.participant_id = ",part_id,sep="")ae_apc_query<-labkey_to_df(ae_apc_sql,version,1000)
ae_apc_sql=(f''' SELECT ae.participant_id, ae.arrivaldate as ae_arrival, ae.diag_all as ae_diag, apc.admidate as apc_arrival, apc.disdate as apc_discharge, apc.opertn_all as apc_operation, apc.diag_all as apc_diag FROM hes_ae as ae, hes_apc as apc WHERE ae.epikey = apc.epikey AND ae.participant_id = {part_id} ''')ae_apc_query=labkey_to_df(ae_apc_sql,version,1000)
ae_apc_sql<-paste("SELECT ae.participant_id, ae.arrivaldate as ae_arrival, ae.diag_all as ae_diag, apc.admidate as apc_arrival, apc.disdate as apc_discharge, apc.opertn_all as apc_operation, apc.diag_all as apc_diag FROM hes_ae as ae, hes_apc as apc WHERE ae.epikey = apc.epikey AND ae.participant_id = ",part_id,sep="")ae_apc_query<-labkey_to_df(ae_apc_sql,version,100000)
Find critical care periods within an overnight stay¶
Critical care periods in the hes_cc table all fall within periods of overnight care in hes_apc. These are linked together by the susrecid column.
apc_cc_sql<-paste("SELECT apc.participant_id, apc.admidate as apc_arrival, apc.disdate as apc_discharge, apc.opertn_all as apc_operation, apc.diag_all as apc_diag, cc.ccstartdate as cc_start, cc.ccdisdate as cc_discharge FROM hes_apc as apc, hes_cc as cc WHERE apc.susrecid = cc.susrecid AND apc.participant_id = ",part_id,"ORDER BY apc.admidate",sep="")apc_cc_query<-labkey_to_df(apc_cc_sql,version,1000)
apc_cc_sql=(f''' SELECT apc.participant_id, apc.admidate as apc_arrival, apc.disdate as apc_discharge, apc.opertn_all as apc_operation, apc.diag_all as apc_diag, cc.ccstartdate as cc_start, cc.ccdisdate as cc_discharge FROM hes_apc as apc, hes_cc as cc WHERE apc.susrecid = cc.susrecid AND apc.participant_id = {part_id} ORDER BY apc.admidate ''')apc_cc_query=labkey_to_df(apc_cc_sql,version,1000)
apc_cc_sql<-paste("SELECT apc.participant_id, apc.admidate as apc_arrival, apc.disdate as apc_discharge, apc.opertn_all as apc_operation, apc.diag_all as apc_diag, cc.ccstartdate as cc_start, cc.ccdisdate as cc_discharge FROM hes_apc as apc, hes_cc as cc WHERE apc.susrecid = cc.susrecid AND apc.participant_id = ",part_id,"ORDER BY apc.admidate",sep="")apc_cc_query<-labkey_to_df(apc_cc_sql,version,100000)
Pull out time periods from medical history tables¶
Other tables, such as the hes_op table of Outpatients day appointments at a hospital, and did table of diagnostic imaging, do not have explicit links to other tables. These can only be linked to other medical history tables by the dates of events.
The following code pulls out the start and end dates of periods of admitted patient care where a diagnosis was made (in this case R06, abnormalities of breathing). We will pull dates out of the table apc_cc_query that we created above.
diag<-'R06'episodes<-apc_cc_query%>%filter(grepl(diag,apc_diag))for(iin1:nrow(episodes)){start_date<-episodes[i,"apc_arrival"]end_date<-episodes[i,"apc_discharge"]# carry out a query using the start and end date}
diag='R06'episodes=apc_cc_query[apc_cc_query['apc_diag'].str.contains(diag)]foriinrange(len(episodes)):start_date=episodes.iloc[i]['apc_arrival']end_date=episodes.iloc[i]['apc_discharge']# carry out a query using the start and end date
diag<-'R06'episodes<-apc_cc_query%>%filter(grepl(diag,apc_diag))for(iin1:nrow(episodes)){start_date<-episodes[i,"apc_arrival"]end_date<-episodes[i,"apc_discharge"]# carry out a query using the start and end date}
Alternatively, you may wish to take a start or end date from one event and find all other events within a period of time before or after that event.
diag<-'R06'episodes<-apc_cc_query%>%filter(grepl(diag,apc_diag))for(iin1:nrow(episodes)){start_date<-episodes[i,"apc_discharge"]end_date<-as.Date(start_date)+90# carry out a query using the start and end date}
diag='R06'episodes=apc_cc_query[apc_cc_query['apc_diag'].str.contains(diag)]foriinrange(len(episodes)):start_date=episodes.iloc[i]['discharge']end_date=datetime.strptime(start_date,'%Y-%m-%d')+pd.DateOffset(days=90)# carry out a query using the start and end date
diag<-'R06'episodes<-apc_cc_query%>%filter(grepl(diag,apc_diag))for(iin1:nrow(episodes)){start_date<-episodes[i,"apc_discharge"]end_date<-as.Date(start_date)+90# carry out a query using the start and end date}
We can use the dates we have pulled out to find all events in other tables that occurred within that time period. In the following query, we will query the did table of diagnostic imaging for rows where the imaging date did_date3 falls within the hospital stay.
To match up rows from did to rows from our hes_* tables, we will create our own index called did_index.
row_counter<-1did_combined<-data.frame()for(iin1:nrow(episodes)){start_date<-episodes[i,"apc_arrival"]end_date<-episodes[i,"apc_discharge"]episodes$did_index[i]<-row_counterdid_sql<-paste("SELECT participant_id, did_date3, ic_region_desc, ic_snomedct_desc, ic_sub_sys_desc, ic_nicip_desc FROM did WHERE participant_id = ",part_id," AND did_date3 >= '",start_date,"' AND did_date3 <= '",end_date,"'",sep="")did_query<-labkey_to_df(did_sql,version,1000)did_query$did_index<-row_counterdid_combined<-rbind(did_combined,did_query)row_counter<-row_counter+1}episodes<-merge(episodes,did_combined,by=c("did_index","participant_id"))
row_counter=1did_combined=pd.DataFrame()foriinrange(len(episodes)):start_date=episodes.iloc[i]['apc_arrival']end_date=episodes.iloc[i]['apc_discharge']episodes.loc[episodes.index[i],'did_index']=row_counterdid_sql=(f''' SELECT participant_id, did_date3, ic_region_desc, ic_snomedct_desc, ic_sub_sys_desc, ic_nicip_desc FROM did WHERE participant_id = {part_id} AND did_date3 >= '{start_date}' AND did_date3 <= '{end_date}' ''')did_query=labkey_to_df(did_sql,version,1000)did_query['did_index']=row_counterdid_combined=pd.concat([did_query,did_combined])row_counter+=1episodes=pd.merge(episodes,did_combined,on=['did_index','participant_id'])
row_counter<-1did_combined<-data.frame()for(iin1:nrow(episodes)){start_date<-episodes[i,"apc_arrival"]end_date<-episodes[i,"apc_discharge"]episodes$did_index[i]<-row_counterdid_sql<-paste("SELECT participant_id, did_date3, ic_region_desc, ic_snomedct_desc, ic_sub_sys_desc, ic_nicip_desc FROM did WHERE participant_id = ",part_id," AND did_date3 >= '",start_date,"' AND did_date3 <= '",end_date,"'",sep="")did_query<-labkey_to_df(did_sql,version,100000)did_query$did_index<-row_counterdid_combined<-rbind(did_combined,did_query)row_counter<-row_counter+1}episodes<-merge(episodes,did_combined,by=c("did_index","participant_id"))