Commit a44a775

resolve issue #6
1 parent 722f793 commit a44a775

1 file changed: +5 -5 lines changed
_episodes/03-data-wrangling.md

Lines changed: 5 additions & 5 deletions
@@ -114,16 +114,16 @@ Our "id" columns are `country` and `continent`. Our "value" columns are all of t

 ~~~
 cols = list(df.columns)
-cols.remove('continent')
-cols.remove('country')
+cols.remove("continent")
+cols.remove("country")
 cols
 ~~~
 {: .language-python}

 Now, we can call `pd.melt()` and pass `cols` rather than typing out the whole list.

 ~~~
-df_melted = pd.melt(df, id_vars=['country', 'continent'], value_vars = cols)
+df_melted = pd.melt(df, id_vars=["country", "continent"], value_vars = cols)
 df_melted
 ~~~
 {: .language-python}
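
For reference, the melt step touched by this hunk can be tried on a small stand-in table. This is only a sketch: the two rows, their values, and the `metric_year` column names are invented here for illustration, and `id_cols` is a hypothetical helper name that does not appear in the lesson.

```python
import pandas as pd

# Toy wide-format table. The rows, values, and column names are
# made up for illustration; the lesson's real dataset is not shown here.
df = pd.DataFrame({
    "country": ["Canada", "Kenya"],
    "continent": ["Americas", "Africa"],
    "gdpPercap_1952": [11367.0, 853.0],
    "gdpPercap_1957": [12490.0, 944.0],
    "lifeExp_1952": [68.8, 42.3],
    "lifeExp_1957": [70.0, 44.7],
})

# Hypothetical helper name for the "id" columns.
id_cols = ["country", "continent"]

# Same result as list(df.columns) followed by two .remove() calls.
cols = [c for c in df.columns if c not in id_cols]

# Each (row, value column) pair becomes one row in the long table.
df_melted = pd.melt(df, id_vars=id_cols, value_vars=cols)
print(df_melted.shape)  # (8, 4): 2 rows x 4 value columns
```

The result has the two id columns plus `variable` (the former column name) and `value`, which are the default names `pd.melt()` assigns.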
@@ -144,7 +144,7 @@ Now that we have melted our dataset, we can address another untidy problem: "Mult
 Take a closer look at the `variable` column. This column contains two pieces of information - the metric and the year. Thankfully, these former column names have a consistent naming scheme, so we can easily split these two pieces of information into two different columns.

 ~~~
-df_melted[['metric', 'year']] = df_melted['variable'].str.split("_", expand=True)
+df_melted[["metric", "year"]] = df_melted["variable"].str.split("_", expand=True)
 df_melted
 ~~~
 {: .language-python}
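
The split in this hunk can be checked in isolation with a few hand-written rows. The sample values below are invented, and the final integer cast is an extra step assumed here for convenience; it is not part of the commit.

```python
import pandas as pd

# Hand-written sample of an already-melted table (values are illustrative).
df_melted = pd.DataFrame({
    "country": ["Canada", "Canada"],
    "variable": ["gdpPercap_1952", "lifeExp_1957"],
    "value": [11367.0, 70.0],
})

# expand=True returns a DataFrame with one column per piece,
# so the two pieces can be assigned to two new columns at once.
df_melted[["metric", "year"]] = df_melted["variable"].str.split("_", expand=True)

# Assumption, not in the commit: the split yields strings, so cast the
# year to an integer if numeric sorting or arithmetic is needed later.
df_melted["year"] = df_melted["year"].astype(int)
print(df_melted[["metric", "year"]])
```

This relies on every name in `variable` containing exactly one underscore, which matches the consistent naming scheme the lesson text describes.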
@@ -161,7 +161,7 @@ df_melted
 Now that all of our columns contain the appropriate information, in a tidy/long format, it's time to save our dataframe back to a CSV file. But first, let's clean up our dataset: we're going to re-order our columns (and remove the now extra `variable` column) and sort the rows.

 ~~~
-df_final = df_melted[['country', 'continent', 'year', 'metric', 'value']]
+df_final = df_melted[["country", "continent", "year", "metric", "value"]]
 df_final = df_final.sort_values(by=["continent", "country", "year", "metric"])
 df_final
 ~~~
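
The clean-up and save described in this hunk might look like the sketch below. The sample rows are invented, and the output goes to an in-memory buffer because the lesson's output file name is not visible in this diff; passing a path to `to_csv()` instead would write a file.

```python
import io

import pandas as pd

# Invented sample of the melted-and-split table.
df_melted = pd.DataFrame({
    "country": ["Kenya", "Canada", "Canada"],
    "continent": ["Africa", "Americas", "Americas"],
    "variable": ["lifeExp_1952", "lifeExp_1952", "gdpPercap_1952"],
    "metric": ["lifeExp", "lifeExp", "gdpPercap"],
    "year": ["1952", "1952", "1952"],
    "value": [42.3, 68.8, 11367.0],
})

# Selecting a list of columns both re-orders them and drops `variable`.
df_final = df_melted[["country", "continent", "year", "metric", "value"]]
df_final = df_final.sort_values(by=["continent", "country", "year", "metric"])

# Assumption: write to a buffer rather than a named file.
# index=False keeps the row index out of the CSV.
buffer = io.StringIO()
df_final.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Without `index=False`, the shuffled row index left over from sorting would be written as an unnamed first column.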
