SQL Window Functions on Data Science Interviews Asked By Airbnb, Netflix, Twitter, and Uber


Window functions are a group of functions that perform calculations across a set of rows related to the current row. They are considered advanced SQL and often come up in data science interviews. They are also used on the job to solve many different types of problems. Let’s summarize the 4 types of window functions and cover why and when you should use them.

4 Types of Window Functions

1. Regular aggregate functions

o These are aggregates such as AVG, MIN/MAX, COUNT, SUM

o Use these to aggregate your data within groups defined by another column, such as month or year

2. Ranking functions

o ROW_NUMBER, RANK, DENSE_RANK

o These are functions that help you rank your data. You can rank your entire dataset or rank it by groups such as month or country

o Very useful for creating rank indexes within groups

3. Generate statistics

o NTILE is great when you need simple statistics such as percentiles, quartiles, or the median

o You can use it for your entire dataset or a group

4. Time series data management

o A common window function especially if you need to calculate trends such as a month-to-month rolling average or a growth metric

o LAG and LEAD are two functions that allow you to do this.

1. Regular aggregate functions

Regular aggregate functions are functions like average, count, sum, and min/max applied to columns. Use them as window functions when you want to apply an aggregation within different groups of your dataset, such as each month.

This is similar to the calculation an aggregate function performs in a SELECT clause with GROUP BY, but unlike regular aggregate functions, window functions don’t collapse multiple rows into a single output row. The rows keep their own identities, and the aggregate for each row’s group is attached to that row.
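To make the difference from GROUP BY concrete, here is a minimal sketch that runs both forms against an in-memory SQLite database from Python. The `trips` table and its three rows are made up for illustration and are not the article's dataset; it also assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

# Hypothetical toy data: (request_month, dist_to_cost).
rows = [("2020-01", 2.0), ("2020-01", 4.0), ("2020-02", 10.0)]

conn = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
conn.execute("CREATE TABLE trips (request_month TEXT, dist_to_cost REAL)")
conn.executemany("INSERT INTO trips VALUES (?, ?)", rows)

# GROUP BY collapses each month into a single output row.
grouped = conn.execute(
    "SELECT request_month, AVG(dist_to_cost) "
    "FROM trips GROUP BY request_month ORDER BY request_month"
).fetchall()

# The window version keeps every input row and attaches the month average to it.
windowed = conn.execute(
    "SELECT request_month, dist_to_cost, "
    "       AVG(dist_to_cost) OVER (PARTITION BY request_month) AS month_avg "
    "FROM trips ORDER BY request_month, dist_to_cost"
).fetchall()

print(grouped)
print(windowed)
```

The grouped query returns one row per month, while the windowed query returns all three original rows, each carrying its month's average alongside its own value.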

Avg() Example:

Let’s look at an example of an avg() window function used to answer a data analytics question. You can view the question and write the code at the link below:

platform.stratascratch.com/coding-question?id=10302&python=

This is a perfect example of using a window function to apply avg() within each month. Here we are trying to calculate the average distance per dollar for each month, which is hard to do in SQL without a window function. We apply the avg() window function in the third column, which puts the month-year average on every row of the dataset. We can then use this metric to calculate the difference between the monthly average and the value for each request date in the table.

The code to implement the window function looks like this:

SELECT a.request_date,
       a.dist_to_cost,
       -- average of dist_to_cost over all rows in the same month
       AVG(a.dist_to_cost) OVER (PARTITION BY a.request_mnth) AS avg_dist_to_cost
FROM
  (SELECT *,
          to_char(request_date::date, 'YYYY-MM') AS request_mnth,
          (travel_distance / money_cost) AS dist_to_cost
   FROM uber_request_logs) a
ORDER BY request_date

2. Ranking Functions

Ranking functions are an important tool for a data scientist. You are constantly ranking and indexing your data to better understand which rows perform best in your dataset. The SQL window functions provide you with 3 ranking utilities — RANK(), DENSE_RANK(), ROW_NUMBER() — depending on your exact use case. These functions help you list your data in order and in groups based on your preferences.
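The three ranking functions differ only in how they treat ties. The sketch below shows this on a made-up `salaries` table with one tied pair, run through SQLite from Python; the table and values are illustrative assumptions, and SQLite 3.25 or newer is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
conn.execute("CREATE TABLE salaries (salary INTEGER)")
# Hypothetical values; the two 90s are the tie of interest.
conn.executemany("INSERT INTO salaries VALUES (?)", [(100,), (90,), (90,), (80,)])

ranked = conn.execute(
    "SELECT salary, "
    "       ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num, "
    "       RANK()       OVER (ORDER BY salary DESC) AS rnk, "
    "       DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rnk "
    "FROM salaries ORDER BY salary DESC, row_num"
).fetchall()

for row in ranked:
    print(row)
```

ROW_NUMBER() always produces 1, 2, 3, 4. RANK() gives both tied rows rank 2 and then skips to 4, while DENSE_RANK() gives them rank 2 and continues with 3.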

Rank() Example:

Let’s look at a rank window function example to see how we can rank data within groups using SQL window functions. Follow along interactively at this link: platform.stratascratch.com/coding-question?id=9898&python=

Here we want to find the top 3 salaries in each department. We can’t just take the top 3 salaries without a window function, because that would only give us the top 3 salaries across all departments, so we have to rank the salaries within each department individually. This is done with rank() partitioned by department. From there it’s very easy to filter for the top 3 in every department.

Here is the code that produces the ranking. You can copy and paste it into the SQL editor at the link above and see the same output.

SELECT department,
       salary,
       -- rank salaries within each department, highest first
       RANK() OVER (PARTITION BY a.department
                    ORDER BY a.salary DESC) AS rank_id
FROM
  (SELECT department, salary
   FROM twitter_employee
   GROUP BY department, salary
   ORDER BY department, salary) a
ORDER BY department,
         salary DESC
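The filtering step mentioned above is not part of the query, so here is one way it could look. Window functions cannot be referenced in the WHERE clause of the same SELECT, so the ranking goes in a subquery and the filter in the outer query. The `employees` table and its rows are hypothetical stand-ins for `twitter_employee`, and the sketch runs on SQLite (3.25+) from Python rather than on the platform's PostgreSQL editor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
conn.execute("CREATE TABLE employees (department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    # Hypothetical rows; the duplicate ("A", 50) is removed by DISTINCT below.
    [("A", 50), ("A", 50), ("A", 40), ("A", 30), ("A", 20), ("B", 70), ("B", 60)],
)

top3 = conn.execute(
    """
    SELECT department, salary, rank_id
    FROM (
        SELECT department, salary,
               RANK() OVER (PARTITION BY department
                            ORDER BY salary DESC) AS rank_id
        FROM (SELECT DISTINCT department, salary FROM employees) AS d
    ) AS ranked
    WHERE rank_id <= 3          -- filter on the rank in the outer query
    ORDER BY department, salary DESC
    """
).fetchall()

for row in top3:
    print(row)
```

Department A keeps its three highest distinct salaries and department B, which has only two, keeps both.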

3. NTILE

NTILE is a very useful function for those in data analytics, business analytics, and data science. When dealing with statistical data, you often need to create statistics such as quartiles, quintiles, medians, and deciles in your daily work, and NTILE makes it easy to generate these outputs.

NTILE takes the number of bins as its argument (that is, how many buckets you want to split your data into) and then divides your rows into that many bins. You specify how the data is ordered and, if you want separate groups, how it is partitioned.
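As a small illustration of how the bins are formed, the sketch below splits ten made-up scores into four buckets with NTILE(4), using SQLite (3.25+) from Python. The `claims` table and its values are assumptions for the example only.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
conn.execute("CREATE TABLE claims (score INTEGER)")
# Hypothetical scores 1..10.
conn.executemany("INSERT INTO claims VALUES (?)", [(s,) for s in range(1, 11)])

buckets = conn.execute(
    "SELECT score, NTILE(4) OVER (ORDER BY score DESC) AS bucket "
    "FROM claims ORDER BY score DESC"
).fetchall()

# Count how many rows landed in each bucket.
sizes = Counter(bucket for _, bucket in buckets)

print(buckets)
print(dict(sizes))
```

Ten rows do not divide evenly by four, so the first two buckets get three rows and the last two get two. The same logic means that NTILE(100) only approximates true percentiles when a partition has far fewer than 100 rows, which is worth keeping in mind before filtering on the bucket number.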

NTILE(100) Example:

In this example, we will learn how to use NTILE to bucket our data into percentiles. You can follow along interactively at the link here: platform.stratascratch.com/coding-question?id=10303&python=

What you’re trying to do here is identify the top 5 percent of claims based on a score that an algorithm outputs (fraud_score in the query below). But you can’t just sort and take the top 5% overall, because you want the top 5% within each state. So one way to do this is to use the NTILE() function with PARTITION BY state. You can then filter in the WHERE clause of the outer query to get the top 5%.

Here is the code that produces the result. You can copy and paste it into the SQL editor at the link above.

SELECT policy_num,
       state,
       claim_cost,
       fraud_score,
       percentile
FROM
  (SELECT *,
          -- bucket claims into 100 bins per state, highest fraud_score first
          NTILE(100) OVER (PARTITION BY state
                           ORDER BY fraud_score DESC) AS percentile
   FROM fraud_score) a
WHERE percentile <= 5

4. Time series data management

LAG and LEAD are two window functions that are useful for dealing with time series data. The only difference between LAG and LEAD is whether you want to retrieve from previous rows or following rows, almost like sampling from previous data or future data.

You can use LAG and LEAD to calculate month-over-month growth or rolling averages. As a data scientist or business analyst, you often deal with time series data and create time-based metrics.
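Here is a minimal sketch of both functions side by side, plus a month-over-month growth figure built from LAG. The `monthly` table and its revenue numbers are invented for illustration, and the query runs on SQLite (3.25+) from Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
conn.execute("CREATE TABLE monthly (month TEXT, revenue INTEGER)")
# Hypothetical monthly totals.
conn.executemany(
    "INSERT INTO monthly VALUES (?, ?)",
    [("2021-01", 100), ("2021-02", 120), ("2021-03", 90)],
)

trend = conn.execute(
    """
    SELECT month, revenue, prev_rev, next_rev,
           (revenue - prev_rev) * 100.0 / prev_rev AS mom_growth_pct
    FROM (
        SELECT month, revenue,
               LAG(revenue, 1)  OVER (ORDER BY month) AS prev_rev,  -- previous row
               LEAD(revenue, 1) OVER (ORDER BY month) AS next_rev   -- following row
        FROM monthly
    ) AS t
    ORDER BY month
    """
).fetchall()

for row in trend:
    print(row)
```

The first month has no previous row and the last has no following row, so LAG and LEAD return NULL there (None in Python), and the growth for the first month is NULL as well.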

LAG() Example:

In this example, we want to find the percentage growth per year, which is a common question that data scientists and business analysts answer every day. The problem statement, data, and SQL editor are available at the following link if you want to try coding the solution yourself: platform.stratascratch.com/coding-question?id=9637&python=

The tricky part of this problem is how the data is set up: you have to use the previous row’s value in your metric. But SQL wasn’t built to do that. SQL is built to calculate anything you want as long as the values are in the same row. So we can use the LAG() or LEAD() window function, which takes the previous or following rows and puts their values in your current row, which is what this query does.

Here is the code that produces the result. You can copy and paste it into the SQL editor at the link above:

SELECT year,
       current_year_host,
       prev_year_host,
       -- percentage growth versus the previous year
       round(((current_year_host - prev_year_host) / (cast(prev_year_host AS numeric))) * 100) AS estimated_growth
FROM
  (SELECT year,
          current_year_host,
          LAG(current_year_host, 1) OVER (ORDER BY year) AS prev_year_host
   FROM
     (SELECT extract(year FROM host_since::date) AS year,
             count(id) AS current_year_host
      FROM airbnb_search_details
      WHERE host_since IS NOT NULL
      GROUP BY extract(year FROM host_since::date)
      ORDER BY year) t1) t2


Source: https://ezinearticles.com/?SQL-Window-Functions-on-Data-Science-Interviews-Asked-By-Airbnb,-Netflix,-Twitter,-and-Uber&id=10395728
