What Python developers need to know before migrating to Go(lang)

This is a (long) blog post about our experience at Repustate in migrating a big chunk of code from Python/Cython to Go. If you want to read the whole story, background and all, read on. If you’re interested in just what Python developers need to know before taking the plunge, click the link below.

Tips & tricks in migrating from Python to Go.

The Background

One of the best technological feats that we’ve done here at Repustate was implementing Arabic sentiment analysis. Arabic is one tough nut to crack because of the complex morphological forms Arabic words can take. Tokenization (splitting a sentence up into individual words) is also tougher in Arabic than in, say, English, because Arabic words can contain whitespace within the word itself (e.g. the position of ‘aleph’ within a word). Without giving away our secret recipe, Repustate uses support vector machines (SVMs) to come up with the most likely meaning behind a sentence and then applies sentiment to that. In total, we use 22 models (i.e. 22 SVMs) and each word in a document is analyzed. So if you have 500 words in a document, that’s 500 × 22 = 11,000 comparisons against the SVMs.

Python

Repustate is almost entirely a Python shop; we use Django for the API and website. So it only made sense (at the time) to keep the code base homogeneous and implement all of the Arabic sentiment engine in Python as well. As far as prototyping and implementing goes, Python is hard to beat. Very expressive, awesome 3rd-party libraries, etc. If you’re serving up web pages, it’s perfect. But when you’re doing low-level computations, doing lots of comparisons against hash tables (dictionaries in Python), things get slow. We were able to process about 2-3 Arabic documents per second, which is too slow. By comparison, our English language sentiment engine can do about 500 per second.

The Bottleneck

So we fired up the Python profiler and began investigating what was taking so long. Remember above how I said we have 22 SVMs and each word passes through all of them? Well, that was all done in serial, not in parallel. Our first thought was to change this to a map/reduce-like operation. TL;DR: the map/reduce idiom stinks in Python. When you need concurrency, Python is just not your friend. At PyCon 2013, Guido spoke about Tulip, his new project that was hoping to remedy this, but that’s not due out for a while, and why wait when there’s already something better out there?

Golang or go home

My friend at Mozilla told me that Mozilla Services was switching over to Go for much of their logging infrastructure, in part because of the awesomeness of goroutines. Go was designed by the folks at Google and it was designed with concurrency as a first-class notion, not an afterthought, as Python’s various solutions are. So we went about making the change from Python to Go.

While the Go code is not yet in production, the results are ridiculously encouraging. We’re doing 1000 documents/s now, using WAY less memory, and not having to debug ugly multiprocess/gevent/“why won’t Control-C kill my process” code that you get in Python.

Why we love Go

Anyone who has a bit of an understanding of how programming languages work (interpreted vs. compiled, dynamic vs. static) will say, “Well duh, obviously Go is faster”. Yeah, we could have re-written the whole thing in Java and seen similar improvements, but that’s not why Go is such a winner. The code you write with Go just seems to be correct. I can’t really put my finger on it, but somehow once the code compiled (and it compiles QUICKLY), you just get the feeling that it’ll work (not just run without error, but even logically be correct). I know, that sounds very wishy-washy, but it’s true. It’s very similar to Python in terms of verbosity (or lack thereof) and it treats functions as first-class objects, so functional programming is easy to reason about. And of course, goroutines and channels make your life so much easier. So you get the performance boost of static typing and having finer control over memory allocation but you don’t forfeit too much in expressiveness.
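To make the goroutine and channel point a bit more concrete, here is a minimal sketch of fanning one word out to several models at once. It is not Repustate's actual code: the `model` type, the `scoreWord` function and the toy scoring rule are all made up for illustration, and a real SVM would do far more work per call.

```go
package main

import (
	"fmt"
	"sync"
)

// model is a hypothetical stand-in for one classifier: it maps a word to a score.
type model func(word string) float64

// scoreWord runs every model against the word in its own goroutine,
// collects the results on a channel, and returns their sum.
func scoreWord(word string, models []model) float64 {
	// Buffered so that no goroutine blocks while sending its result.
	results := make(chan float64, len(models))

	var wg sync.WaitGroup
	for _, m := range models {
		wg.Add(1)
		go func(m model) {
			defer wg.Done()
			results <- m(word)
		}(m)
	}

	// Wait for all models to finish, then close the channel so the
	// range loop below terminates.
	wg.Wait()
	close(results)

	total := 0.0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	// 22 toy models (assumption): model i returns len(word) * i.
	models := make([]model, 22)
	for i := range models {
		weight := float64(i)
		models[i] = func(word string) float64 {
			return float64(len(word)) * weight
		}
	}
	fmt.Println(scoreWord("salaam", models))
}
```

The serial version is the same loop without the `go` keyword, the channel and the `sync.WaitGroup`, which is a large part of why moving from serial to concurrent code feels cheap in Go.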

Things we wish we knew

With all the compliments out of the way, you really do need a different mindset at times when dealing with Go compared to Python. So here’s a list of notes I kept as the migration took place – just random things that popped into my head when converting Python code to Go:

  • No built-in type for sets (have to use maps and test for existence)
  • In absence of sets, have to write your own intersection, union etc. methods
  • No tuples, have to write your own structs or use slices (arrays)
  • No equivalent of dict.get() with a default, so you always have to check for existence rather than setting defaults, e.g. in Python you can do value = d.get("a_key", "default_value")
  • Having to always check errors (or at least explicitly ignore them)
  • Can’t have variables/packages that aren’t used, so testing simple things sometimes requires commenting out lines
  • Going between []byte and string. Many regexp methods take []byte (byte slices are mutable, strings are not), although String variants such as FindString exist too. It makes sense, but it’s annoying all the same having to convert & re-convert some variables.
  • Python is more forgiving. You can take slices of strings using indexes that are out of range and it won’t complain. You can take negative slices – not Go.
  • You can’t have mixed type data structures. Maybe it’s not kosher, but sometimes in Python I’ll have a dictionary where the values are a mix of strings and lists. Not in Go, you have to either clean up your data structures or define custom structs. Thanks to Ralph Corderoy for showing me how to do this properly (use the interface, Luke, i.e. interface{} values). http://play.golang.org/p/SUgl7wd9tk
  • No unpacking of a tuple or list into separate variables (e.g. x, y, z = [1, 2, 3])
  • UpperCamelCase is the convention for anything exported (if a function/struct in a package doesn’t start with an upper-case letter it won’t be exposed to other packages). I like Python’s lower_case_with_underscores more.
  • Have to explicitly check if errors are != nil, unlike in Python where many types can be used for bool-like checks (0, “”, None can all be interpreted as being “not” set)
  • Documentation on some modules (e.g. crypto/md5) is sparse BUT go-nuts on IRC is awesome, really great support available
  • Converting a number to a string (int64 -> string) is different from going from []byte -> string (just use string([]byte)); for numbers you need to use strconv.
  • Reading Go code is definitely more like a programming language whereas Python can be written as almost pseudocode. Go has more non-alphanumeric characters and uses || and && instead of “or” and “and”.
  • Writing to a file, there’s File.Write([]byte) and File.WriteString(string) – a bit of a departure for Python developers who are used to the Python zen of having one way to do something
  • String interpolation is awkward, have to resort to fmt.Sprintf a lot
  • No constructors, so common idiom is to create NewType() functions that return the struct you want
  • Else (or else if) has to be formatted properly, where the else is on the same line as the curly bracket from the if clause. Weird.
  • Different declaration syntax is used depending on whether you are inside or outside of a function, i.e. the short form := only works inside functions, while package-level variables need var with =
  • If I want a list of just the keys or just the value, as in dict.keys() or dict.values(), or a list of tuples like in dict.items(), there is no equivalent in Go, you have to iterate over maps yourself and build up your list
  • I use an idiom at times of having a dictionary where the values are functions that I want to invoke given a key. You can do this in Go, but all functions have to accept & return the same thing i.e. have the same method signature
  • If you’re using JSON and your JSON is a mix of types, goooooood luck. You’ll have to create a custom struct that matches the format of your JSON blob, and then Unmarshal the raw JSON into an instance of your custom struct. Much more work than just obj = json.loads(json_blob) like we’re used to in Python land.
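Several of the notes above (no sets, no dict.get() with a default, no dict.keys()) come down to writing a few small helpers around maps. The sketch below shows one way to do that; the names `set`, `newSet`, `intersect`, `getOr` and `sortedKeys` are invented for this example and are not part of the standard library.

```go
package main

import (
	"fmt"
	"sort"
)

// set is a hypothetical helper type: a map whose keys are the members.
// The empty struct value takes no storage.
type set map[string]struct{}

// newSet builds a set from the given items.
func newSet(items ...string) set {
	s := make(set, len(items))
	for _, it := range items {
		s[it] = struct{}{}
	}
	return s
}

// has reports whether item is a member, using the "comma ok" map lookup.
func (s set) has(item string) bool {
	_, ok := s[item]
	return ok
}

// intersect returns the members present in both a and b.
func intersect(a, b set) set {
	out := make(set)
	for k := range a {
		if b.has(k) {
			out[k] = struct{}{}
		}
	}
	return out
}

// getOr mimics Python's d.get(key, default) for string maps.
func getOr(m map[string]string, key, def string) string {
	if v, ok := m[key]; ok {
		return v
	}
	return def
}

// sortedKeys collects the keys by hand and sorts them, because
// Go's map iteration order is not defined.
func sortedKeys(s set) []string {
	keys := make([]string, 0, len(s))
	for k := range s {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	a := newSet("x", "y", "z")
	b := newSet("y", "z", "w")
	fmt.Println(sortedKeys(intersect(a, b))) // [y z]

	conf := map[string]string{"lang": "ar"}
	fmt.Println(getOr(conf, "lang", "en"))    // ar
	fmt.Println(getOr(conf, "missing", "en")) // en
}
```

Union and difference follow the same pattern as `intersect`: loop over one map and test membership in the other.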
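For the JSON note, here is a rough sketch of the two usual routes with encoding/json: a struct that mirrors the blob, or a map[string]interface{} when the shape isn't known up front. The `document` struct, its field names and the sample blob are assumptions made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// document is a hypothetical struct mirroring the sample blob below.
// The struct tags map JSON keys onto exported Go fields.
type document struct {
	Text  string   `json:"text"`
	Score float64  `json:"score"`
	Tags  []string `json:"tags"`
}

// parseDocument decodes a blob whose shape is known in advance.
func parseDocument(blob []byte) (document, error) {
	var d document
	err := json.Unmarshal(blob, &d)
	return d, err
}

// parseLoose decodes a blob of unknown shape. Values come back as
// interface{} and need type assertions before use; JSON numbers
// arrive as float64 and arrays as []interface{}.
func parseLoose(blob []byte) (map[string]interface{}, error) {
	var m map[string]interface{}
	err := json.Unmarshal(blob, &m)
	return m, err
}

func main() {
	blob := []byte(`{"text": "marhaba", "score": 0.8, "tags": ["greeting", "ar"]}`)

	d, err := parseDocument(blob)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(d.Text, d.Score, d.Tags)

	m, err := parseLoose(blob)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	if tags, ok := m["tags"].([]interface{}); ok {
		fmt.Println("number of tags:", len(tags))
	}
}
```

The struct route gives you type checking and named fields, while the map route is closer to json.loads() but pushes the type assertions into every place the data is used.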

Was it worth it?

Yes, a million times, yes. The speed boost is just too good to pass up. Also, and this counts for something I think, Go is a trendy language right now, so when it comes to recruiting, I think having Go as a critical part of Repustate’s tech stack will help.
