Blog Post

Finding the Titles in R

,

PASS has released the videos to members from this past Summit. I say TJay Belt today ask about relating a video name to a session. I have the USB drive, so I looked on there. Here are the videos:

2018-01-10 13_25_47-Video

Not terribly helpful. If you run the HTML file from the stick, you see this:

2018-01-10 13_26_23-PASS Summit 2017

If I hover over a title, I see the link as a specific video file. For example, the first one is 65545.mp4. With that, I looked around and found a javascript file with information in it.

The structure was like this:

//SID
Col0[0] = "65073";
Col0[1] = "65091";
…
//Speaker Name
Col2[0] = "Steve Stedman";
Col2[1] = "Kellyn Pot'Vin-Gorman";
…
//Session Name
Col4[0] = "Your Backup and Recovery Strategy";
Col4[1] = "DevOps Tool Combinations for Winning Agility";

All the data is in one file, but the index in each array matches. So Col0[0] is the SID for video 65073, which has Col2[0] as the speaker and col4[0] as the title.

Now I want to get these in some sort of order. First, let me copy this data into separate files. That will make importing easier. I’ll copy the SID array into one file, the speaker array into a second file and the title array into a third.

This gives me data like the list above, but I need to clean that. This is easiest in Sublime, with a few replacements. I did

  • “COL[“ –> “”
  • “] = “ –> “,”
  • “;” –> “”

This gives me a clean file that looks like this:

2018-01-10 13_29_18-e__Documents_R_titles.txt - Sublime Text

Working in R

I almost started to move this into T-SQL and a table, but since I’ve been playing with R, I decided to see what I could do there. First, I know I need to load data, so I the first file into a data frame.

session.index = read.csv("e:\\Documents\\R\\videosid.txt", sep=",")

The column names aren’t great, so we’ll fix those:

 colnames(session.index) <- c("Index", "SessionSID")

Now

let’s get the other data.

session.speaker = read.csv("e:\\Documents\\R\\passspeaker.txt", sep=",")
> session.title = read.csv("e:\\Documents\\R\\titles.txt", sep=",") 
> colnames(session.speaker) <- c("Index", "Speaker")
> colnames(session.title) <- c("Index", "Title")

I have three data frames. I want to combine them. Let’s do that. I’ll use the merge() function to do this. Since I’ve got common column names, I’ll use those.

> pass.videos <- merge(session.index, session.title, by="Index")
> pass.videos <- merge(pass.videos, session.speaker, by="Index")

This gives me a data frame with the index, title, and speaker. Now I’ve got the data merged, let’s produce a file..

 write.table(pass.videos, file="e:\\Documents\\R\\passvideos.txt",sep=",")

With that done, I can see I have a list of video numbers, titles, and speakers.

"Index","SessionSID","Session","Speaker"
 "1",1,65091,"DevOps Tool Combinations for Winning Agility","Kellyn Pot'Vin-Gorman"
 "2",2,65092,"Oracle vs. SQL Server - The War of the Indices","Kellyn Pot'Vin-Gorman"
 "3",3,65112,"Make Power BI Your Own with the Power BI APIs","Steve Wake"

I did something in R. Smile

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating