★ Capstone Project
Part 3 of 3
Transformations & Analysis
Extract insights from cleaned data — text transformations, age calculations, arithmetic derived columns, occupancy percentages, and finally transpose a reshaped census file.
Scenario
The data is clean — now it's time to enrich it. You'll extract partial text from IDs and diagnosis codes, compute patient ages from dates, derive occupancy percentages from census figures, and handle a hospital dataset that was delivered in transposed (row-as-column) format. Each section produces output ready for pivot table analysis in Excel.
Lab Files — Part 3
Capstone Instructions
| Steps | Directions |
|---|---|
| Section 8 — Text Transformations (Patient Satisfaction query) | |
| Replace patient_id column values to show only the last 6 characters of each ID. | |
| Create a new column with just the first character of the diagnosis column. Name it diagnosis_category. | |
| Close and load the query into the Excel workbook. Create a pivot table from the patient_satisfaction_results table. | , then Insert › PivotTable |
Checkpoint 8Using the pivot table, which diagnosis category group had the lowest average response score for Question 1?
| Steps | Directions |
|---|---|
| Section 9 — Calculating Age Between Two Dates | |
| In the Power Query editor, transform survey_date to show Date only — remove the time component. | |
| Add a new column using Subtract Days to subtract birth_date from survey_date — giving age in days. | |
| Convert the age-in-days column to years by dividing by 365. | |
| Round the column to 0 decimal places. | |
| Rename the column age_calc. | |
Checkpoint 9Investigate the source of missing patient_age values. Which hospital site is largely associated with the missing age data?
| Steps | Directions |
|---|---|
| Section 10 — Numeric Transformations (Daily Census query) | |
| Select the dailycensus_7665 query. | |
| Add a column that calculates discharges minus admissions. Rename it net_difference. | |
| From net_difference, derive a new column that extracts the sign of each value (-1, 0, or 1). | |
| Close and load this query to the Excel workbook. | |
Checkpoint 10Filter the loaded table in Excel to Sign = -1 (negative flow days). What is the average net difference on those days? Round to the nearest whole number.
| Steps | Directions |
|---|---|
| Section 11 — Occupancy, Month & Year Columns | |
| With the dailycensus_7665 query selected, add a column that divides patient_days by budgeted_beds. | |
| Convert the new column to percentage format, then rename it occupancy. | , then set type to Decimal |
| Add a new column to extract the Year from effective_date. | |
| Add another column to extract the Month Name from effective_date. | |
| Shorten the Month Name column to display only the first 3 characters (e.g. January → Jan). | |
| Close and load the query. Insert a pivot table from the dailycensus table — Average of occupancy with Month Name in Rows and Year in Columns. | Insert › PivotTable, then configure values and axes |
Checkpoint 11Looking at the pivot table, how many years of daily census data showed December as the month with the highest average occupancy?
| Steps | Directions |
|---|---|
| Section 12 — Transposing a Reshaped Census File | |
| Load dailycensus_2259.xlsx (sheet: dailycensus) into a new query. Name the query dailycensus_2259. | |
| Transpose the dataset so rows become columns. | |
| Promote the first row as headers. | |
| Set effective_date to Date type. | |
| Add a column that calculates patient_days minus budgeted_beds. Label it patientdaysvsbudgeted. | |
| Filter the query to show only rows where patientdaysvsbudgeted is greater than 0. | Use the column header filter dropdown |
Checkpoint 12 — Final QuestionThe operations lead for Hospital 2259 is considering adding one more budgeted bed. Among days where patient days exceeded budgeted beds, what percentage had a net variance (patientdaysvsbudgeted) of exactly 1?
When to use Transpose: Data sometimes arrives with dates as column headers and metrics as rows — the opposite of what Power Query (and pivot tables) expect. Transpose flips the entire table, turning rows into columns and vice versa. Always follow it immediately with "Use First Row as Headers" since the column names end up in row 1 after the flip.