Sabbatical August Report: My Year as a Data Scientist

This is a continuation of the series of my sabbatical reports. Here are the previous entries.

I am continuing to work on the same project. I have all of the data put together (from four different databases), and now I am testing and validating the data. This essentially means that I am looking at a random sample of my data and computing the results by hand.
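As a rough illustration of that kind of spot check, here is a minimal Python sketch. Everything in it is hypothetical: the record ids, the `pipeline_results` values, and the helper `draw_validation_sample` are made up for the example and are not taken from the actual project. The idea is just to draw a reproducible random sample, work those records out by hand, and list any disagreements.

```python
import random

# Hypothetical pipeline output: record id -> value computed by the pipeline.
pipeline_results = {record_id: record_id * 2 for record_id in range(1, 101)}


def draw_validation_sample(record_ids, k, seed=0):
    """Return k record ids chosen at random, reproducibly for a given seed."""
    rng = random.Random(seed)
    return sorted(rng.sample(sorted(record_ids), k))


sample_ids = draw_validation_sample(pipeline_results, k=5)

# In practice these values would be worked out by hand and typed in.
# Here they are filled in with the same rule so that the example runs.
hand_computed = {record_id: record_id * 2 for record_id in sample_ids}

# Any id listed here is a record where the pipeline and the hand
# calculation disagree and that needs a closer look.
mismatches = [
    record_id
    for record_id in sample_ids
    if pipeline_results[record_id] != hand_computed[record_id]
]
print("sampled ids:", sample_ids)
print("mismatches:", mismatches)
```

Fixing the seed means the same sample can be drawn again later, which helps when a mismatch has to be rechecked after a fix.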

The skills needed to be a data scientist

I am using the same tools. I hope to say more about some more exciting tools next month, though.

I use a lot of joins, almost all of which are outer joins (either “left” or “full”). I have temporarily understood the different joins at different points in my past, but now I am fluent. I have been using joins for three (I think) purposes.

  • I use full outer joins to concatenate two data sets that contain the same information. For instance, we have one database that stores old data and one that stores new data. If I want to look at all of the data, I do a full outer join to combine them.
  • I use left joins (you could make it a right join if you like) to append new columns to a table without adding new rows. This is how I usually think of a left join.
  • I use left joins (again, you could make these right joins easily) to filter out data. This isn’t something that I thought of prior to this. Basically, I have a main table, but it has too much data. If I can create a second table that just has the rows I want, I can do “second table LEFT JOIN main table.” I don’t do this often—I am not in this situation a lot, and filtering usually works better—but I have done it.
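The three uses above can be sketched with Python's built-in `sqlite3` module and an in-memory database. This is only an illustration under stated assumptions: the table names (`old_data`, `new_data`, `main_table`, `extra_cols`, `wanted_ids`), the columns, and the rows are all hypothetical. Also, because older SQLite versions do not support `FULL OUTER JOIN`, the first query emulates one with a left join plus the unmatched rows from the other side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE old_data   (id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE new_data   (id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE main_table (id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE extra_cols (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE wanted_ids (id INTEGER PRIMARY KEY);

    INSERT INTO old_data   VALUES (1, 100), (2, 200);
    INSERT INTO new_data   VALUES (2, 250), (3, 300);
    INSERT INTO main_table VALUES (1, 100), (2, 250), (3, 300);
    INSERT INTO extra_cols VALUES (1, 'north'), (3, 'south');
    INSERT INTO wanted_ids VALUES (2), (3);
    """
)

# Use 1: combine old and new data. This emulates a full outer join:
# every old row (matched or not), plus the new rows with no old match.
combined = conn.execute(
    """
    SELECT o.id, o.amount AS old_amount, n.amount AS new_amount
    FROM old_data AS o LEFT JOIN new_data AS n ON o.id = n.id
    UNION ALL
    SELECT n.id, NULL, n.amount
    FROM new_data AS n LEFT JOIN old_data AS o ON n.id = o.id
    WHERE o.id IS NULL
    ORDER BY 1
    """
).fetchall()

# Use 2: append a column to the main table without adding rows.
# Rows with no match in extra_cols get NULL (None) for region.
appended = conn.execute(
    """
    SELECT m.id, m.amount, e.region
    FROM main_table AS m LEFT JOIN extra_cols AS e ON m.id = e.id
    ORDER BY m.id
    """
).fetchall()

# Use 3: filter the main table by putting the small table on the left.
filtered = conn.execute(
    """
    SELECT w.id, m.amount
    FROM wanted_ids AS w LEFT JOIN main_table AS m ON w.id = m.id
    ORDER BY w.id
    """
).fetchall()

print(combined)
print(appended)
print(filtered)
conn.close()
```

One caveat on the second use: the row count only stays the same when the join key is unique in the table being appended. Duplicate keys there would multiply rows in the result.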

How academia and business are different.

In academia, I get a lot of pleasure from helping students learn. This is pretty immediate, since the students are often right in front of me. Since I strongly value learning and education, I regularly see concrete ways in which I help the world, albeit in small ways each time.

My experience in business has been different, which is a bit ironic. I am working for a department in the bank that helps another department in the bank collect payments from people. There are several layers between where I am and the people I am supposed to be helping (at least one department). There is also a time delay: I am working on something that won’t be used for at least a couple of months, and that is the only thing I am working on.

That said, the bank lends a lot to farmers (it was voted the best bank for agriculture, and the best bank overall, in Minnesota in 2021). The bank is doing good work for society by, say, helping farmers buy equipment so that we can have food to eat. Still (and I am embarrassed to say this as a mathematician), this seems a bit too abstract for me at times, and I sometimes struggle to recognize the importance of the work. I truly believe that much of it is important; I just don’t always feel it.

How will this experience influence my teaching?

I don’t have much to say with respect to my sabbatical (unless it is subconscious), but I have been thinking a lot about labor-based grading. I am grateful to David Clark for being willing to have me bounce half-baked, rambling ideas off of him. He opened himself up to such treatment by mentioning labor-based grading in the excellent Grading for Growth Substack (with Robert Talbert).

Actually, there is one thing: it seems like there is a great demand for data-literate people in marketing. So I might push harder to get our business majors into our Data Analytics minor.

My feelings about being in industry.

My main experience right now is sadness that I am not teaching. I was briefly back on campus yesterday, and I was happy to see all of the students roaming around—and sad that I am not directly a part of it this year. This is part of the purpose of a sabbatical—to make you appreciate the great gig that you already have. Absence makes the heart grow fonder, and all.

11 Responses to “Sabbatical August Report: My Year as a Data Scientist”

  1. Joss Ives Says:

    Thanks for keeping up with these updates Bret. I love the “a bit too abstract” comment 🙂

    • bretbenesh Says:

      Hi, Joss! I hope you are doing well!

      I was thinking this morning that I should have hit the “abstract” part harder in the blog, so I am glad that it stood out for you. I am really a bit embarrassed about it.
