A personal diary of DataFrame munging over the years.
Convert Series datatype to numeric (will error if column has non-numeric values)
(h/t @makmanalp)
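The original snippet for this conversion is not shown here, so the following is only a minimal sketch of one way to do it, assuming `pd.to_numeric` is acceptable. The sample Series and variable names are made up for illustration. With the default `errors="raise"`, a non-numeric value raises `ValueError`; `errors="coerce"` turns such values into `NaN` instead.

```python
import pandas as pd

# Hypothetical sample data: numbers stored as strings.
clean = pd.Series(["1", "2", "3.5"])
as_numeric = pd.to_numeric(clean)  # dtype becomes float64

# A column with a non-numeric value raises under the default errors="raise".
mixed = pd.Series(["1", "two", "3"])
try:
    pd.to_numeric(mixed)
    raised = False
except ValueError:
    raised = True

# errors="coerce" replaces unparseable values with NaN instead of raising.
coerced = pd.to_numeric(mixed, errors="coerce")
```

Coercing is convenient for messy data, but it hides bad values, so it may be worth counting the resulting `NaN`s before relying on the column.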
| So you've cloned somebody's repo from GitHub, but now you want to fork it and contribute back. Never fear! | |
| Technically, when you fork, "origin" should be your fork and "upstream" should be the project you forked; however, if you're willing to break this convention, then it's easy. | |
| * Off the top of my head * | |
| 1. Fork their repo on GitHub | |
| 2. In your local clone, add a new remote pointing to your fork; then fetch it, and push your changes up to it | |
| git remote add my-fork [email protected] |
| ISO 3166-1 alpha-3 Country Code | ISO 3166-1 alpha-2 Country Code | Country | ISO 639-2 Lang | ISO 639-1 Lang | Language | Date Format | |
|---|---|---|---|---|---|---|---|
| ALB | AL | Albania | sqi | sq | Albanian | yyyy-MM-dd | |
| ARE | AE | United Arab Emirates | ara | ar | Arabic | dd/MM/yyyy | |
| ARG | AR | Argentina | spa | es | Spanish | dd/MM/yyyy | |
| AUS | AU | Australia | eng | en | English | d/MM/yyyy | |
| AUT | AT | Austria | deu | de | German | dd.MM.yyyy | |
| BEL | BE | Belgium | fra | fr | French | d/MM/yyyy | |
| BEL | BE | Belgium | nld | nl | Dutch | d/MM/yyyy | |
| BGR | BG | Bulgaria | bul | bg | Bulgarian | yyyy-M-d | |
| BHR | BH | Bahrain | ara | ar | Arabic | dd/MM/yyyy |
| from time import sleep | |
| from tornado.httpserver import HTTPServer | |
| from tornado.ioloop import IOLoop | |
| from tornado.web import Application, asynchronous, RequestHandler | |
| from multiprocessing.pool import ThreadPool | |
| _workers = ThreadPool(10) | |
| def run_background(func, callback, args=(), kwds={}): | |
| def _callback(result): |
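The fragment above is cut off after `_callback`, so its body is unknown. As a rough, standard-library-only sketch of the general idea, assuming `run_background` simply hands `func` to the thread pool and forwards the result to `callback`: the names `slow_square`, `results`, and `handle` are hypothetical, and the tornado-specific step of scheduling the callback back onto the `IOLoop` is deliberately left out.

```python
from multiprocessing.pool import ThreadPool

_workers = ThreadPool(4)


def run_background(func, callback, args=(), kwds=None):
    """Run func(*args, **kwds) on a worker thread and pass its result to callback.

    Note: here the callback runs on the pool's result-handler thread,
    not on an event loop thread.
    """
    return _workers.apply_async(func, args, kwds or {}, callback)


def slow_square(x):
    # Stand-in for a blocking call (hypothetical).
    return x * x


results = []
handle = run_background(slow_square, results.append, args=(7,))
handle.wait()  # block until the worker has finished and the callback has run

_workers.close()
_workers.join()
```

In a real tornado handler the callback would normally be marshalled back to the I/O loop thread rather than run directly from the pool, since handler state is not thread-safe.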
| set langmap=ёйцукенгшщзхъфывапролджэячсмитьбюЁЙЦУКЕНГШЩЗХЪФЫВАПРОЛДЖЭЯЧСМИТЬБЮ;`qwertyuiop[]asdfghjkl\\;'zxcvbnm\\,.~QWERTYUIOP{}ASDFGHJKL:\\"ZXCVBNM<> | |
| nmap Ж : | |
| " yank | |
| nmap Н Y | |
| nmap з p | |
| nmap ф a | |
| nmap щ o | |
| nmap г u | |
| nmap З P |
| [MASTER] | |
| # Specify a configuration file. | |
| #rcfile= | |
| # Python code to execute, usually for sys.path manipulation such as | |
| # pygtk.require(). | |
| #init-hook= | |
| # Profiled execution. |
| require 'formula' | |
| class GraphTool < Formula | |
| homepage 'http://graph-tool.skewed.de/' | |
| url 'http://downloads.skewed.de/graph-tool/graph-tool-2.2.31.tar.bz2' | |
| sha1 '5e0b1c215ecd76191a82c745df0fac17e33bfb09' | |
| head 'https://github.com/count0/graph-tool.git' | |
| depends_on 'pkg-config' => :build | |
| depends_on 'boost' => 'c++11' |
If you have gcc > 5.0 installed on your server, you can use my Anaconda package, which I compiled with OpenMP and Boost 1.60. To install it, run:
conda install graph-tool -c floriangeigl -c msarahan -c conda-forge -c bioconda -c ostrokach -c vgauthier -c salford_systems
| # A simple cheat sheet of Spark Dataframe syntax | |
| # Current for Spark 1.6.1 | |
| # import statements | |
| from pyspark.sql import SQLContext | |
| from pyspark.sql.types import * | |
| from pyspark.sql.functions import * | |
| #creating dataframes | |
| df = sqlContext.createDataFrame([(1, 4), (2, 5), (3, 6)], ["A", "B"]) # from manual data |