A beginner’s Guide to BigQuery Sandbox and exploring public datasets.

A beginner's guide to BigQuery Sandbox

          As you might realize by now, writing SQL queries is one of the essential skills any inspiring data analyst needs to master. After all, larger datasets are typically stored in relational databases and Structured Query Language is the language that helps us communicate with such databases. Sure, NoSQL is gaining prominence amid the growing popularity of nontraditional databases, but we need to learn to crawl before we start walking. Merely 10 years ago, you would need to download and install a RDBMS software package (be it MySQL, PostgreSQL, or SQLite), load a sample database and do a hundred pushups before you could write your very first SQL query. Luckily technology sprung ahead and we now have a plethora of web-based SQL editor options from SQL Lite Online to SQL Fiddle that eliminate the software setup step, but might still require us to load sample data. What if you wanted to access real-world big data sets from the comfort of your browser without having to download any software, no hassle, no trial, no credit card required? Well, you’re in luck, what follows is the beginner’s guide to Google BigQuery’s Sandbox. An active Google account is your cost of admission. BONUS: Machine Learning models are powered by nothing else but SQL are also included.
Continue reading

7 Ways to conditionally calculate sum of values in Excel.


Conditional SUM functions in  Excel

         Excel offers different ways to accomplish the same task. This is especially evident in the case of using Excel functions, where we can simply choose the one that offers the best solution, or more realistically, the one that we are more comfortable using. As an example, let’s solve the following scenario: We are offering online Excel courses both: on our internal website, as well as on Udemy’s platform. Udemy charges us 50% fee on all course sales, and also offers promotional rates to increase our volume. As a result, we are selling the same content at different prices. Looking at Thanksgiving week sales performance, let’s calculate our total Net Sales for all of Udemy transactions (highlighted). Let’s use different Excel functions to perform calculation required.

Continue reading






Solving ModelOff Data Analysis problem using Microsoft Access SQL.

MS Access Solution for ModelOff 2013 Data Analysis problem

         Last week we solved ModelOff’s Data Analysis problem from their 2013 championship. Since the second round of 2014 Model Off competition takes place this Saturday, November 8th, let’s pay respect to the data superheroes making it thus far. Our previous ModelOff solution involved using PivotTable feature of Microsoft Excel. Would you believe that we can realistically conceive a solution to the Data analysis problem, using Microsoft Access, or even better, Microsoft’s flavor of the SQL language?

         I am as big of an Excel fan as the next guy, much bigger, actually, on the second thought. However, I also believe that when possible, using the right tools for the job will yield better, faster results, than duct-taping your workarounds. So, why Access? Why on Earth, SQL? Well, let’s go to the source: ” If you often have to view your data in a variety of ways, depending on changing conditions or events, Access might be the better choice for storing and working with your data. Access lets you use Structured Query Language (SQL) queries to quickly retrieve just the rows and columns of data that you want, whether the data is contained in one table or many tables. You can also use expressions in queries to create calculated fields.”
Continue reading