Table Joins

Understanding Table Joins in SQL

Working with data often requires combining multiple data sources, usually stored in different tables (in a database) or data frames (in programming languages or data visualization tools). To put this data to good use, we want to be able to join these tables based on a field or fields they have in common (foreign keys), or sometimes on values in those fields that differ. The basic principles of table joins, namely INNER, OUTER (FULL, LEFT, and RIGHT) and CROSS (or Cartesian) joins, and even UNION-ing tables, are not only universal to most relational databases and flavors of SQL; they also apply to working with data frames. In this post we will explore examples of using these table joins in a PostgreSQL database, adding SELF and LEFT/RIGHT exclusive joins for good measure.
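As a quick taste of the join types named above, here is a minimal sketch. It rests on a few assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (the join syntax shown is the same in both), and the customers and orders tables, their columns, and their rows are hypothetical, made up purely for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    -- Order 12 points at customer 4, who does not exist in customers.
    INSERT INTO orders VALUES (10, 1, 'book'), (11, 1, 'pen'), (12, 4, 'lamp');
""")


def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# INNER JOIN: only rows with a match in both tables.
inner_rows = run("""
    SELECT c.name, o.item
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    ORDER BY o.id
""")

# LEFT JOIN: every customer, with NULL (None) where no order matches.
left_rows = run("""
    SELECT c.name, o.item
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""")

# LEFT exclusive join: customers that have no matching order at all.
left_exclusive_rows = run("""
    SELECT c.name
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    WHERE o.id IS NULL
    ORDER BY c.id
""")

# CROSS JOIN: every customer paired with every order (3 x 3 rows).
cross_count = run("SELECT COUNT(*) FROM customers CROSS JOIN orders")[0][0]

print(inner_rows)           # [('Ann', 'book'), ('Ann', 'pen')]
print(left_rows)            # Ann twice, then Bob and Cid with None
print(left_exclusive_rows)  # [('Bob',), ('Cid',)]
print(cross_count)          # 9
```

RIGHT and FULL OUTER joins are left out of this sketch because older SQLite versions do not support them; in PostgreSQL they follow the same pattern with the keyword swapped.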

US COVID-19 Cases

During these uncertain times, how can you make sense of the data tsunami being presented on the state of the pandemic in the US? For the last couple of months, many Americans have found themselves checking the spread of COVID-19 cases on a daily basis. As most US states went into shelter-in-place mode, resources like Johns Hopkins and 91-DIVOC became a daily refuge for those seeking to stay informed. In today’s post, we will create our own version of a web-based, interactive and visually appealing COVID-19 dashboard using Google Data Studio. In doing so we will gain a better understanding of the data used, decide which data we deem most relevant, and keep control over how best to visualize it so that our audience can make the most sense of it. In the process of building this data viz, we will use various objects and features of the mighty GDS application, including the Google Sheets connector, calculated fields, Scorecard, Table, Geo Map, Line and Combo charts, Date range and Filter controls, and the recently released optional metrics. These are some, but not all, of the features we will cover.

XLOOKUP Function

I personally don’t know why it took this long for the Microsoft Excel team to create the XLOOKUP function. The fact that VLOOKUP is considered one of Excel’s most widely used functions reflects a strong demand for lookup operations. Sure, many of VLOOKUP’s limitations can be overcome with patience, helper columns, INDEX/MATCH, CHOOSE, OFFSET and other constructs. Yet why rely on workarounds when we could use a more powerful function with a multitude of applications? Meet the much-anticipated XLOOKUP function, which was officially released to Office 365 subscribers in early February of this year. It offers a long list of additional benefits; in today’s tutorial we will review 11 scenarios that take full advantage of the following XLOOKUP features:

  • LEFT lookup
  • Horizontal lookup
  • Multi-cell/array retrieval
  • Match based on wildcard conditions
  • Combination of Vertical AND Horizontal lookups
  • Lookup based on multiple criteria
  • Lookup in reverse order
  • Lookup for maximum/minimum values
  • Built-in Error Handling
  • Exact match by default
  • Flexible approximate match
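
To make a few of these features concrete outside of Excel, here is a rough Python sketch that mimics a subset of XLOOKUP's behavior: exact match by default, built-in error handling, reverse-order search, wildcard matching and approximate matching. The argument names follow Excel's XLOOKUP(lookup_value, lookup_array, return_array, [if_not_found], [match_mode], [search_mode]); everything else is an assumption for illustration. In particular, it works on plain Python lists, returns None rather than #N/A when nothing is found, skips the binary search modes, and borrows fnmatch wildcards, which are close to but not identical to Excel's.

```python
from fnmatch import fnmatchcase


def xlookup(lookup_value, lookup_array, return_array,
            if_not_found=None, match_mode=0, search_mode=1):
    """Simplified emulation of XLOOKUP over two equal-length lists.

    match_mode:  0 exact (default), -1 exact or next smaller,
                 1 exact or next larger, 2 wildcard.
    search_mode: 1 first-to-last (default), -1 last-to-first.
    """
    if len(lookup_array) != len(return_array):
        raise ValueError("lookup_array and return_array must have equal length")

    indices = list(range(len(lookup_array)))
    if search_mode == -1:
        indices.reverse()  # lookup in reverse order

    if match_mode == 2:
        # Wildcard match, case-insensitive like Excel (fnmatch syntax assumed).
        pattern = str(lookup_value).lower()
        for i in indices:
            if fnmatchcase(str(lookup_array[i]).lower(), pattern):
                return return_array[i]
        return if_not_found

    # An exact match is tried first in every remaining mode.
    # Note: unlike Excel, this comparison is case-sensitive for strings.
    for i in indices:
        if lookup_array[i] == lookup_value:
            return return_array[i]

    if match_mode == -1:
        # Next smaller item: the largest value below lookup_value.
        smaller = [i for i in indices if lookup_array[i] < lookup_value]
        if smaller:
            return return_array[max(smaller, key=lambda i: lookup_array[i])]
    elif match_mode == 1:
        # Next larger item: the smallest value above lookup_value.
        larger = [i for i in indices if lookup_array[i] > lookup_value]
        if larger:
            return return_array[min(larger, key=lambda i: lookup_array[i])]

    return if_not_found  # built-in error handling


# Hypothetical sample data.
names = ["Ann", "Bob", "Cid", "Bob"]
scores = [90, 75, 60, 80]
thresholds = [0, 50, 80]
grades = ["C", "B", "A"]

print(xlookup("Bob", names, scores))                      # 75
print(xlookup("Bob", names, scores, search_mode=-1))      # 80
print(xlookup("Zed", names, scores, if_not_found="n/a"))  # n/a
print(xlookup("c*", names, scores, match_mode=2))         # 60
print(xlookup(65, thresholds, grades, match_mode=-1))     # B
```

Because the lookup and return arrays are passed separately, a LEFT lookup needs no special handling here: the return list can just as well come from a column to the left of the lookup column.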

Google Dataset Search

Google has been dominating web search for nearly two decades, and its acquisition of YouTube gave it the second most popular search engine in the world. Yet it seemingly lost the product search niche to Amazon. It’s not surprising that, amid growing interest in all things data, including public and open data, this tech giant would be keen on developing a product geared towards making dataset search easier. What is surprising is how long it took them to develop and release this product, which was officially introduced to the general public on January 23rd, 2020, after spending more than 16 months in beta testing. You can embark on your own dataset search journey here.
