生活中的一天

  • 版本 :2022.1 及更高版本

塑造数据意味着什么?这对可以构建的可视化效果以及可以执行的分析有何影响?在下面的教程中,我们将探索分析和可视化方案,确定阻碍我们的数据限制,然后了解 Tableau Prep 如何帮助我们调整数据以达到预期结果。 下载数据集,并使用 Tableau Prep 和 Tableau Desktop 在生活场景中关注这些日子。了解如何应用 Tableau Prep 中的特性和功能,以便为在 Tableau Desktop 中分析数据做好准备。

达到医院的能力是有问题的,但资源过剩也是如此。从病床作为资源的角度来理解医院病床非常重要。但是,数据通常是从患者的角度存储的。我们如何获取捕获患者在床上的时间并确定床位使用情况的数据?

注意:要完成这些教程中的任务,您需要安装 Tableau Prep 和 Tableau Desktop(可选):

要安装 Tableau Prep 和 Tableau Desktop,请参阅
Tableau Desktop 和 Tableau Prep Deployment Guide(链接在新窗口中打开).否则,您可以下载Tableau Prep(链接在新窗口中打开)Tableau Desktop(链接在新窗口中打开)免费试用。

您还需要下载三个数据文件。建议将它们保存在“我的 Tableau 准备存储库>数据源”文件夹中。

- 床.xlsx(链接在新窗口中打开)
- 营业时间.xlsx(链接在新窗口中打开)
- 病床.xlsx(链接在新窗口中打开)

数据

对于我们的四张病床,A,B,C和D,我们跟踪哪个病人在床上,以及他们在那里的开始和结束时间。数据如下所示:

病床数据集预览

初步分析

如果我们将这些数据引入 Tableau Desktop,则可以创建一个甘特图来显示患者何时躺在床上。

病床数据集的 Tableau 桌面中的甘特图

这是一个有用的视觉对象。我们可以看到,床位A和B的使用只有很小的间隙,但床C的使用非常不足。Bed D的病人没有结束时间,但我们可以通过一些计算来解决这个问题。这为我们提供了如何使用床的直观概述。

但是,如果我们想计算一张床空着的小时数呢?或者比较新政策实施前后的开放就寝时间?没有简单的方法可以做到这一点,因为它目前是结构化的。

所需的数据结构

通过创建一些非常基本的数据集并在 Tableau Prep 中将它们组合在一起,我们可以将此数据集修改为一种形式,以便我们执行更深入的分析并创建更有用的可视化效果。

在进入 Tableau Prep 之前,让我们退后一步,想想我们需要创建什么来回答“每张床空了多少小时”这个问题。

我们需要能够查看每个小时的每张病床,并知道病床上是否有病人。目前,数据仅在患者躺在床上时;我们尚未向 Tableau 提供有关空闲时间的信息。

为了创建所有床位和所有时间的完整矩阵,我们将创建两个新的数据集。一个只是床位列表(A,B,C,D),另一个是小时列表(1,2,3,...,23,24)。通过执行交叉联接(将一个数据集中的每一行与另一个数据集中的每一行联接),我们最终将获得床位和小时数的每个可能组合。

TheBeds.xlsx data set looks like this:The Hours.xlsx data set looks like this:And the cross joined results look like this:
Preview of the Beds data setPreview of the Hours data setPreview of the Bed-Hour matrix data

Next, we'll bring in the Patient Beds information, labeling each bed-hour combination as having a specific patient or not. We wind up with a data set that has a row for each bed-hour, and if a patient was in the bed, their number and start and end times. Null values indicate the bed was unoccupied.

Preview of the Bed-Hour-Patient matrix data

With the data in this structure, we can perform analyses like this, which enables us to investigate unoccupied beds as easily as patient beds.

Dashboard of shaped data showing bed use by hour and patient

Restructuring the Data

So how do we get there with Tableau Prep? We'll build out the flow in two parts, first building the Bed Hours matrix, then combining it with the Patient Beds data. Make sure you've downloaded all three Excel files (Beds.xlsx, Hours.xlsx, and Patient Beds.xlsx) to follow along.

Bed Hour Matrix

First, we'll connect to the Beds.xlsx file.

  1. Open Tableau Prep.

  2. From the start screen, click Connect to Data.

  3. On the Connections pane, click Microsoft Excel. Navigate to where you saved Beds.xlsx and click Open.

  4. The Beds sheet should automatically be brought out to the Flow pane.

Tip: For more information about connecting to data, see Connect to Data(Link opens in a new window).

Next, we need to create a field we can use to do the cross join with the Hours data set. We'll add a calculation that is simply the value 1.

  1. In the Flow pane, select Beds and click the suggested Clean Step.

  2. With the Clean step we just added, the Profile pane will come up. Click Create Calculated Field in the toolbar.

  3. Name the field Cross Join and enter the value 1.

  4. The Data grid should update show the current state of the data.

Now we'll repeat the process with the Hours data set.

Click for directions

Both data sets now have a shared field, Cross Join, and can be joined.

  1. Join the two cleaning steps by dragging Clean 2 onto Clean 1 and dropping it on the Join option.

    The flow pane showing the Join drop area when Clean 2 is dragged to Clean 1

  2. In the Join Profile below, the join configurations should populate automatically.

    The Join Profile pane with various elements called out to correspond to the list above

    A. Join clause, B. Join type, C. Data grid results

    • Because we named both fields Cross Join, Tableau Prep automatically identifies them as the shared field and creates the appropriate Applied Join Clauses.

    • The default Join Type is inner, which is what we want.

    • This join will match all rows from Beds with all rows from Hours, as seen in the Data grid.

Tip: For more information about joins, see Join your data(Link opens in a new window).

We no longer need the Cross Join fields, so we can remove them.

  1. In the Flow pane, select Join 1, click the plus icon, and select Add Clean Step.

  2. Select the fields Cross Join-1 and Cross Join, and click Remove Fields.

  3. Double click on the Clean 3 label and rename that step Bed Hour Matrix.

We now have the Bed Hour Matrix data set that contains all beds and all hours and have finished the first part of building our data set.

Patient Bed Use

Part two is bringing in the patient bed usage. To start, we'll connect to the data.

  1. On the Connections pane, click the Add connection button to add another data connection.

  2. Choose Microsoft Excel and then select the Patient Beds.xlsx file, and click Open.

  3. In the Flow pane, select Patient Beds, click the plus icon and select Add Clean Step.

Because the Bed Hour Matrix file is based on hour but Patient Beds is based on actual time, we need to pull the hour out of the Patient Beds start and end times. Additionally, for the end time, we want to ensure that if a patent is still in the bed at the end of the day (midnight, hour 24) we indicate that the bed is occupied even though there's no end time in the data set. We'll add a calculated field in this new step.

  1. In the toolbar, click Create Calculated Field.

  2. Name the field Start Hour. For the calculation, enter .DATEPART('hour',[Start Time])

  3. This takes the hour of the start time and pulls it out. Therefore, "1/1/18 9:35 AM" becomes simply "9".

  4. Create another calculated field named End Hour. For the calculation, enter .IFNULL(DATEPART('hour',[End Time]), 24)

  5. The portion takes the hour of the end time. The portion will assign an end time of 24 (midnight) to any missing end time.DATEPARTIFNULL

Now we're ready to join patient bed usage to the Bed Hour Matrix. This is a bit more complex join than we did previously. An inner join would only return values present in both data sets. Because we want to make sure we keep all the bed-hour slots, regardless of whether or not a patient was in the bed, we'll need to do a left join. This will result in a lot of nulls, but that's appropriate.

We also need to match when a bed-hour slot is taken by a patient (or patients). So in addition to matching the bed the patient is in we also need to consider the time. The Bed Hour Matrix data set just has an Hour field, and the Patient Beds data set has Start Hour and End Hour. We'll use some basic logic to determine if a patient should be assigned to a given bed-hour slot: A patient is considered in a bed if their start hour is less than or equal to (<=) the bed-hour slot AND their end hour is greater than or equal to (>=) the bed-hour slot.

Therefore, three join clauses are needed to appropriately match these two data sets together.

  1. Join the Clean 3 step with the Bed Hour Matrix step.

  2. In the Applied Join Clauses area, the default should be Hour = End Hour. Click the join clause to change the operator from "=" to "<=".

  3. Click the plus button in the upper right corner of the Applied Join Clauses area to add another join clause. Set it to be Hour >= Start Hour

  4. Add a third join clause for Bed = Hospital Bed.

  5. In the Join Type section, click the unshaded area of the graphic next to Bed Hour Matrix to change the join type to a Left join.

    The join profile pane for Join 2 showing the correct configuration

Note: If you drag the Bed Hour Matrix to Clean 3 instead of the other way around, the desired results can be obtained by using a right join instead of a left join. The order of dragging the steps matters for the orientation of the join. The join clauses will also be in reverse order—be sure to preserve the correct logic of comparing the hours.

Our data is now joined, but we should clean up some artifacts from the join and make sure the fields are tidy. We no longer need Start Hour and End Hour. Hospital Bed and Bed are also redundant. Finally, a value of null in the Patient field really means the bed is unoccupied.

  1. In the Flow pane, add a cleaning step so we can tidy up the joined data.

  2. Ctrl+click (Command+click on Mac) to multi select the fields End Hour, Start Hour, and Hospital Bed, then click Remove Fields in the toolbar.

  3. On the Patient field profile card, double click the null value and type Unoccupied.

We now have a data structure with a row for every bed-hour; if there was a patient in bed during that hour, we have the patient information as well. All that remains to do is add an output step and generate the data set itself.

  1. In the Flow pane, select Clean 4, click the plus icon, and select Add Output.

  2. In the Output pane, change the Output type to .csv then click Browse.

  3. Enter Bed Hour Patient Matrix for the name and choose the desired location before clicking Accept to save.

  4. Click theRun Flow button at the bottom of the pane to generate your output. Click Done in the status dialog to close the dialog.

Tip: For more information about outputs and running a flow, see Save and Share Your Work(Link opens in a new window).

The final flow should look like this:

The flow pane with all steps completed

Analysis in Tableau Desktop

To install Tableau Desktop before continuing with this tutorial, you can download the free trial.

Now that we have the data set in the desired structure, we can perform deeper analysis than with the original data.

  1. Open Tableau Desktop. In the Connect pane, select Text file, navigate to the Bed Hour Patient Matrix.csv file, and click Open.

  2. On the Data source tab, the data should appear in the canvas by default. Click to Sheet 1.

  3. In the Data pane, drag Hour above the line separating Measures and Dimensions to make it a discrete dimension.

  4. Drag Bed to the Rows shelf and Hour to the Columns shelf.

  5. Drag Patient to the Color shelf.

Basic view of shaped data in Tableau Desktop showing bed use by hour

Formatting is optional, but may help make the visual more readable.

  1. Click on the Color shelf and select Edit Colors.

  2. In the area to the left, select Unoccupied. From the drop down on the right, choose the Seattle Grays color palette.

  3. Select the fourth, lightest gray, and click OK.

  4. Click the Color shelf again, then click the Border dropdown. Choose the second gray option at the far right.

  5. In the toolbar, from the Size dropdown, change from Standard to Fit Width.

    Image of the location of the size dropdown in the toolbar

  6. Click the Format menu and then Borders.

  7. For Row Divider, click the Pane dropdown and choose a very light gray.

  8. Adjust the Level slider to the second tick mark.

  9. Repeat with the Column Divider. Set the Pane color to be light gray and the Level to the second tick mark.

    A snippet showing the format pane with the correct settings for the borders

  10. Double click the sheet tab at the bottom and rename it Bed Use by Hour.

This view lets us quickly see when a given bed was occupied or open.

A view showing the bed use by hour

But we can go further and count the number of hours each bed was unoccupied.

  1. Click the new sheet tab icon at the bottom to open a clean sheet.

  2. Drag Patient to Rows.

  3. Drag Hour to Columns. Right click the Hour pill to open the menu. Choose Measure > Count.

  4. Drag another copy of the Patient field from the Data pane to the Color shelf.

  5. Right click on the axis and select Edit Axis. Change the title to Hours and close the dialog.

  6. Rename the sheet tab Bed Hours by Patient.

A view showing bed hours by patient

This view lets us identify how many unoccupied bed hours we had, something we couldn't do with the original data set. What other charts or dashboards can you create? Give it a try now that your data is in the right structure.

Recap and Resources

To build this data structure using Tableau Prep, we needed to perform the following actions:

  1. Build a data set for each aspect we want to analyze, in this case, Beds and Hours.

  2. Cross join those data sets to create a Bed Hour Matrix data set with every possible combination of beds and hours.

  3. Join the Bed Hour Matrix with the Patient Bed data, making sure the join keeps all bed-slot hours and the join clauses appropriately match patient bed data with the bed-hour slots.

We used the following calculations to create fields we could join on. The second and third pull out the hour information from the original datetime fields.

  • Cross Join = 1

    • 这只会将值 1 分配给每行

  • 开始时间 = DATEPART('hour',[Start Time])

    • 这会占用开始时间的小时并将其拉出。因此,“1/1/18 9:35 AM”变为简单的“9”。

  • 结束时间 = IFNULL(DATEPART('hour',[End Time]), 24)

    • 我们可以使用 ,就像我们在开始时间中所做的那样。这需要结束时间的小时并将其拉出。因此,“1/1/18 4:34 PM”变为简单的“4”。DATEPART('hour',[End Time])

    • 但我们想指出,仍然被占用(无结束时间)的病床正在使用中,而不是空的。为此,我们将使用该函数将结束时间 24(午夜)分配给任何缺少的结束时间。如果第一个参数为 null,则计算将返回“24”。IFNULLDATEPART('hour',[End Time])

注意:想检查您的作品吗?下载 Tableau Prep 打包的流程文件 (Hospital Beds.tflx(链接在新窗口中打开)) 和 Tableau Desktop 打包工作簿文件 (Hospital Beds.twbx(链接在新窗口中打开)).

资源:需要更多培训?查看新的培训视频(链接在新窗口中打开)对于 Tableau Prep 或参加面对面培训(链接在新窗口中打开)课程。对我们介绍的功能感到好奇吗?请查看 Tableau Prep 联机帮助中的其他主题。寻找其他资源?这掌握 Tableau 准备工作,并使用此学习资源列表(链接在新窗口中打开)博客文章适合您。