SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Anyone using Python?


Anyone using Python?

Author
Message
xsevensinzx
xsevensinzx
SSC-Insane
SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)

Group: General Forum Members
Points: 20103 Visits: 5407
Man, I always come by here and never see anything. I'm always tooting how great Python is in these forums so I figured I would make the first post here.

Question

Anyone actually using Python for machine learning or well, anything relating to analytics, BI and statistics?

I have for some time, but only for data ingestion (e.g.: API). I have been trying to sharpen up my scripting skills to include more machine learning techniques with Python. I have a growing library of Python scripts that have been collected from a number of data science & machine learning courses with a focus on Python over R.

Lately, I've been doing hacker rank problems to also sharpen my skills. I ran into this problem where I think Python solves the challenge, but not fast enough to meet the requirements on larger data sets. If anyone has any insights, please feel free to jump in below.

Problem I've Been Working On

Given two numbers NN and MM. NN indicates the number of elements in the array and MM indicates number of queries. You need to perform two types of queries on the array. You are given MM queries. Queries can be of two types, type 1 and type 2.


    Type 1 queries are represented as 1 i j : Modify the given array by removing elements from ii to jj and adding them to the front.

    Type 2 queries are represented as 2 i j : Modify the given array by removing elements from ii to jj and adding them to the back.


Your task is to simply print |A[1]−A[N]||A[1]−A[N]| of the resulting array after the execution of MM queries followed by the resulting array. Note While adding at back or front the order of elements is preserved.

Input Format

First line consists of two space-separated integers, NN and MM. Second line contains NN integers, which represent the elements of the array. MM queries follow. Each line contains a query of either type 1 or type 2 in the form type i j

Sample Input

8 4
1 2 3 4 5 6 7 8
1 2 4
2 3 5
1 4 7
2 1 4

Explanation

Given array is {1,2,3,4,5,6,7,8}
After execution of query 1 2 4, the array becomes {2,3,4,1,5,6,7,8}
After execution of query 2 3 5, the array becomes {2,3,6,7,8,4,1,5}
After execution of query 1 4 7, the array becomes {7,8,4,1,2,3,6,5}
After execution of query 2 1 4, the array becomes {2,3,6,5,7,8,4,1}
Now |A[1]−A[N]||A[1]−A[N]| is |(2−1)||(2−1)| i.e. 11 and the array is 23657841

Data

input_data.txt attached.

My Code Submission

This is too slow.. Sad


def array_query(s, t, i, j):
""" Type 1 Query that modifies the given array by removing elements from i to j and adding them to the front.

:param i: N starting number for range
:param j: N ending number for range
:return : New array with the chopped number ranges specified from i and j.
"""

if t == 1:
s[:0] = s[i-1:j]
del(s[i+(j-i):j+(j-i+1)])
elif t == 2:
s += s[i-1:j]
del(s[i-1:j])

return s

if __name__ == "__main__":
input_size = map(int, raw_input().split()) # Array size, Total Queries
source_array = map(int, raw_input().split()) # Input array

for case in range(0, input_size[1]):
test_case = map(int, raw_input().split())
array_query(source_array, # a[ ] = Input Array
test_case[0], # q[0] = Query type
test_case[1], # q[1] = i range (Min)
test_case[2]) # q[2] = j range (Max)

print abs(source_array[0] - source_array[input_size[0] - 1])
print ' '.join(map(str, source_array))


Attachments
input_data.txt (239 views, 1.00 MB)
Steve Jones
Steve Jones
SSC Guru
SSC Guru (582K reputation)SSC Guru (582K reputation)SSC Guru (582K reputation)SSC Guru (582K reputation)SSC Guru (582K reputation)SSC Guru (582K reputation)SSC Guru (582K reputation)SSC Guru (582K reputation)

Group: Administrators
Points: 582278 Visits: 20875
Interesting. I'll have to play with this. No time now, but I've done some Python, and have been working with the SDTIG on their Python series. I've gotten behind, but they had an interesting machine learning segment, with the last meeting this Wed.

http://www.meetup.com/San-Diego-Technology-Immersion-Group-SDTIG/

I think I'd do something similar to your approach. String slicing makes sense, though probably not super fast.

Do you mean slow in terms of lots of input rows, wide strings, or both?

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
xsevensinzx
xsevensinzx
SSC-Insane
SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)

Group: General Forum Members
Points: 20103 Visits: 5407
Steve Jones - SSC Editor (4/24/2016)
Interesting. I'll have to play with this. No time now, but I've done some Python, and have been working with the SDTIG on their Python series. I've gotten behind, but they had an interesting machine learning segment, with the last meeting this Wed.

http://www.meetup.com/San-Diego-Technology-Immersion-Group-SDTIG/

I think I'd do something similar to your approach. String slicing makes sense, though probably not super fast.

Do you mean slow in terms of lots of input rows, wide strings, or both?



This is slow because the bigger the phone book, the longer it takes so to speak. I need a better approach that scales to meet the 10 second minimum requirement. The attached data actually completes within 12 seconds on my machine. It's not that slow, but not fast enough. The below recommendation is one way to tackle the problem.

The recommended approach is by doing a Treap. This is similar to a binary search tree, but a self-balancing binary search tree. I've seen examples in C++, but not too much in Python without modules. Trying to do this without a module. Guess I can manually do it.

Treap


Bosko Vukov
Bosko Vukov
SSC Veteran
SSC Veteran (277 reputation)SSC Veteran (277 reputation)SSC Veteran (277 reputation)SSC Veteran (277 reputation)SSC Veteran (277 reputation)SSC Veteran (277 reputation)SSC Veteran (277 reputation)SSC Veteran (277 reputation)

Group: General Forum Members
Points: 277 Visits: 549
Hi, i think you can speed up your list "slice and dice" if you convert list to (byte) string using struct module and pack/unpack.


from struct import *

def string_query(s, t, i, j):
""" Type 1 Query that modifies the given array by removing elements from i to j and adding them to the front.
:param i: N starting number for range
:param j: N ending number for range
:return : New array with the chopped number ranges specified from i and j.
"""

if t == 1:
return s[ 4*(i-1): 4*j ] + s[ : 4*(i-1) ] + s[ 4*j : ]
else:
return s[ : 4*(i-1) ] + s[ 4*j : ] + s[ 4*(i-1): 4*j ]

if __name__ == "__main__":
ifl = open('input_data.txt','r')

isz = map(int, ifl.readline().split()) # Array size, Total Queries
sa = map(int, ifl.readline().split()) # Input array
lsa = len(sa)
sa = ''.join( map( lambda x: pack('i', x ), sa ))
for case in range(0, isz[1]):
tc = map(int, ifl.readline().split())
sa = string_query( sa, # a[ ] = Input Array
tc[0], # q[0] = Query type
tc[1], # q[1] = i range (Min)
tc[2]) # q[2] = j range (Max)
sa = unpack( 'i' * lsa, sa )
print abs(sa[0] - sa[isz[0] - 1])
print ' '.join(map(str, sa))


xsevensinzx
xsevensinzx
SSC-Insane
SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)

Group: General Forum Members
Points: 20103 Visits: 5407
They are strict on module usage. Don't think I can use it. Let's see.
xsevensinzx
xsevensinzx
SSC-Insane
SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)

Group: General Forum Members
Points: 20103 Visits: 5407
Well then, I guess struct was good to go. Amazing how something as simply taking an approach to pack and unpack the data allowed me to reduce the runtime below 10 seconds on all 27 test cases.

To compare the results. I went from 7.31 seconds on the last good test case with the previous code to over 10 seconds to fail the rest of the test cases.

Using your suggestion, I was able to solve all the test cases under 10 seconds, with the longest running time 8.39 seconds.

:-D
Bosko Vukov
Bosko Vukov
SSC Veteran
SSC Veteran (277 reputation)SSC Veteran (277 reputation)SSC Veteran (277 reputation)SSC Veteran (277 reputation)SSC Veteran (277 reputation)SSC Veteran (277 reputation)SSC Veteran (277 reputation)SSC Veteran (277 reputation)

Group: General Forum Members
Points: 277 Visits: 549
:-) https://www.hackerrank.com/challenges/array-and-simple-queries ...
I hope you get that job.
xsevensinzx
xsevensinzx
SSC-Insane
SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)

Group: General Forum Members
Points: 20103 Visits: 5407
Bosko Vukov (4/27/2016)
:-) https://www.hackerrank.com/challenges/array-and-simple-queries ...
I hope you get that job.


Pssh, no job attached to this. I do these for fun and stream them on TwitchTv to sharpen my skills! ;-)
Steve Jones
Steve Jones
SSC Guru
SSC Guru (582K reputation)SSC Guru (582K reputation)SSC Guru (582K reputation)SSC Guru (582K reputation)SSC Guru (582K reputation)SSC Guru (582K reputation)SSC Guru (582K reputation)SSC Guru (582K reputation)

Group: Administrators
Points: 582278 Visits: 20875
Simpler, but I've done some of these: http://exercism.io/

Python is built to use modules. Trying to not use them, in some sense, defeats the purpose. I guess it gains algorithmic practice, but perhaps leads you to spend time on items that could better be done in other ways.

I guess it's a balance of some sort.

Follow me on Twitter: @way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com
xsevensinzx
xsevensinzx
SSC-Insane
SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)SSC-Insane (20K reputation)

Group: General Forum Members
Points: 20103 Visits: 5407
Steve Jones - SSC Editor (4/27/2016)
Simpler, but I've done some of these: http://exercism.io/

Python is built to use modules. Trying to not use them, in some sense, defeats the purpose. I guess it gains algorithmic practice, but perhaps leads you to spend time on items that could better be done in other ways.

I guess it's a balance of some sort.


It's likely because a lot of modules do all the work for you. They are trying to restrict that and see what you can do with the bare essentials for learning purposes. Having to always lean on a module will surely hinder you in solving these complex problems.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum








































































































































































SQLServerCentral


Search