Power Query Course
★ Capstone Project
Part 1 of 3

Loading Multiple Data Types

Work as a data analyst at a healthcare organization. Your first task: load three different data sources — an Excel file, a plain-text file, and a PDF — each requiring a different import approach.

~40 minutes
Level: Intermediate
3 data files

Scenario

You're a newly hired data analyst at a healthcare organization. Three datasets have landed on your desk — each in a different format. Power Query can handle all of them. This part focuses on the import stage: connecting to each source, removing noise, and getting the data into a clean, typed, loadable state.

Lab Files — Part 1
dailycensus_7665.xlsxDaily activity volumes by hospital ID
Download
drug_administrations.txtDrug prescription records — flat text format
Download
Patient_Experience_Survey_Template.pdf11-question patient survey — PDF format
Download
Capstone Instructions
StepsDirections
Section 1 — Daily Census: Excel Import
Open a blank Excel workbook and launch the Power Query editor.
Load dailycensus_7665.xlsx (sheet: dailycensus) into Power Query.Get DataFrom FileFrom Workbook
Remove the first row.HomeRemove RowsRemove Top Rows
Promote the first row as headers.HomeUse First Row as Headers
Ensure effective_date is set to Date type.
?
Checkpoint 1What was the last effective date in the dataset?
StepsDirections
Section 2 — Drug Administrations: Text File Import
Load drug_administrations.txt into a new query.HomeNew SourceFileText/CSV
Keep only prescription_date and prescribing_doctor_id — remove all other columns.Select both columnsHomeRemove ColumnsRemove Other Columns
Rename this query: prescription_dates_by_doctor.Right-click the query name in the Queries pane
Close and load the data into the Excel workbook and view the resulting table.HomeClose & Load
In an empty Excel cell, use COUNTIF() to count the total rows attributed to doctor "D0987".Use the Excel sheet — not Power Query
?
Checkpoint 2Doctor D0987 is expected to account for 5% of total prescriptions. What is their actual proportion?
StepsDirections
Section 3 — Patient Survey: PDF Import
Load Patient_Experience_Survey_Template.pdf into a new query.Get DataFrom FileFrom PDF
In the Navigator window, preview each option and select the table that contains all 11 survey questions.
Keep only the first and last columns — remove everything in between.Select both columnsRemove Other Columns
Rename the two columns: question and questionid.Double-click column headers to rename
Re-order columns so that questionid appears first.Drag column header, or Transform › Move
Rename the query: PatientExperienceSurveyQuestions.
?
Checkpoint 3What is the M formula shown in the last applied step of the PatientExperienceSurveyQuestions query?
PDF import tip: Power Query detects tables from the document structure, and the Navigator will show multiple preview options. Always check each one — the correct table is usually the one that shows the cleanest row-per-question layout. Column headers from PDFs are often auto-generated, so renaming is always needed.