SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 

Finding the Titles in R

PASS has released the videos to members from this past Summit. I say TJay Belt today ask about relating a video name to a session. I have the USB drive, so I looked on there. Here are the videos:

2018-01-10 13_25_47-Video

Not terribly helpful. If you run the HTML file from the stick, you see this:

2018-01-10 13_26_23-PASS Summit 2017

If I hover over a title, I see the link as a specific video file. For example, the first one is 65545.mp4. With that, I looked around and found a javascript file with information in it.

The structure was like this:

//SID
Col0[0] = "65073";
Col0[1] = "65091";

…

//Speaker Name
Col2[0] = "Steve Stedman";
Col2[1] = "Kellyn Pot'Vin-Gorman";

…

//Session Name
Col4[0] = "Your Backup and Recovery Strategy";
Col4[1] = "DevOps Tool Combinations for Winning Agility";

All the data is in one file, but the index in each array matches. So Col0[0] is the SID for video 65073, which has Col2[0] as the speaker and col4[0] as the title.

Now I want to get these in some sort of order. First, let me copy this data into separate files. That will make importing easier. I’ll copy the SID array into one file, the speaker array into a second file and the title array into a third.

This gives me data like the list above, but I need to clean that. This is easiest in Sublime, with a few replacements. I did

  • “COL[“ –> “”
  • “] = “ –> “,”
  • “;” –> “”

This gives me a clean file that looks like this:

2018-01-10 13_29_18-e__Documents_R_titles.txt - Sublime Text

Working in R

I almost started to move this into T-SQL and a table, but since I’ve been playing with R, I decided to see what I could do there. First, I know I need to load data, so I the first file into a data frame.

session.index = read.csv("e:\\Documents\\R\\videosid.txt", sep=",")

The column names aren’t great, so we’ll fix those:

 colnames(session.index) <- c("Index", "SessionSID")

Now
let’s get the other data.

session.speaker = read.csv("e:\\Documents\\R\\passspeaker.txt", sep=",")
> session.title = read.csv("e:\\Documents\\R\\titles.txt", sep=",") 
> colnames(session.speaker) <- c("Index", "Speaker")
> colnames(session.title) <- c("Index", "Title")

I have three data frames. I want to combine them. Let’s do that. I’ll use the merge() function to do this. Since I’ve got common column names, I’ll use those.

> pass.videos <- merge(session.index, session.title, by="Index")

> pass.videos <- merge(pass.videos, session.speaker, by="Index")

This gives me a data frame with the index, title, and speaker. Now I’ve got the data merged, let’s produce a file..

 write.table(pass.videos, file="e:\\Documents\\R\\passvideos.txt",sep=",")

With that done, I can see I have a list of video numbers, titles, and speakers.

"Index","SessionSID","Session","Speaker"
 "1",1,65091,"DevOps Tool Combinations for Winning Agility","Kellyn Pot'Vin-Gorman"
 "2",2,65092,"Oracle vs. SQL Server - The War of the Indices","Kellyn Pot'Vin-Gorman"
 "3",3,65112,"Make Power BI Your Own with the Power BI APIs","Steve Wake"

I did something in R. Smile

The Voice of the DBA

Steve Jones is the editor of SQLServerCentral.com and visits a wide variety of data related topics in his daily editorial. Steve has spent years working as a DBA and general purpose Windows administrator, primarily working with SQL Server since it was ported from Sybase in 1990. You can follow Steve on Twitter at twitter.com/way0utwest

Comments

Leave a comment on the original post [voiceofthedba.com, opens in a new window]

Loading comments...