Why You’re Not Getting Value from Your Data Science
If companies want to get value from their data, they need to focus on accelerating human understanding of data, scaling the number of modeling questions they...
Using ls-files
Using html in a Jupyter Notebook
Using the zip command
Link versions of a Docker image back to Git commits
Using pv, tee, and sha256sum
There’s more than one way to skin a cat
One character makes a big difference
Implement queues using the collections.deque module
Using git log
It is difficult to surpass the magnitude of the damage caused by two particular inventions, and both were created by the same man
Different approaches to extract the information you want
Using the allow-empty flag
Where do you want to take out?
When disaster strikes, how quickly will you recover?
Use repr() for Programmers vs str() for Users
Using wget and time
Celebration time!
Data viz tips from a 1914 book we can still learn from
Modern versions of common vintage tools
sponge and pee
Stack JPEGs vertically
Use the slots attribute to limit the attributes of the class
Playful thought experiments
Useful for direct copy & pasting of commands
How to overcome the communication limitations of the internet and actually help people
A useful header for bash scripts to avoid common bugs
Find stale symlinks optimally
The textwrap module to the rescue!
How and why to automatically use the walrus operator
Git’s cherry-picking syntax is easy to mess up
Ensure all layer commands are visible
Preserve environment when using sudo
Putting a backslash escape sequence into an f-string
Browse the world from your command line
How to create your own ‘with’
Quick one-liner to identify the backend
Quickly get the information you need to free a used port
Useful prior to changing a configuration file
Perl to the rescue!
Use re.VERBOSE to your advantage
Make your tools work for you!
Get the quotient and remainder for a fraction
Test run your jq filters
Better understand your shell commands
Via an open-source command line tool
Speed up your usage of notebooks
Another webpage to add to your toolkit
Validate your docker-compose.yml files
Using ordering and limit functions
Using a cross join or a subselect query
Using the expand option of pd.Series.str.split
Avoid landing data unnecessarily
Tidy up text output of various commands
Speed up your git clones
Our acceptance of violence today stands in striking contrast to Americans’ horror at the 1929 Valentine’s Day Massacre
Add notifications to your scripts or python code
Groupby expert level
Get a notification from your script
Save those valuable thoughts for later
Aid readability of numbers
How to modify a file owned by root
Tips from Towards Data Science
More control over display of your pandas objects
Easily construct pipelines that read from and write to the same file
Please be kind to your open source maintainers
A more useful default behavior
A great example of when to use the gitconfig includeif directive
Avoid prepending commands with sudo
Only log if threshold is set appropriately
Two troubleshooting tips
Create more readable code
How to enable regex flags
Do a dry run from the Terminal
Extra options for output file names
Understand why that process is taking so long
Works no matter where the script is being called from
Another use for pandoc
Skip the UI entirely
Quick tip using history expansion
Easily add and remove attributes
Additional methods useful in various scenarios
Wrap and fill lines
Using random.SystemRandom
du command to the rescue!
Using variable manipulation
Show Google Calendar interface even with a custom feed
Deep dive into Homebrew Python environments
The continue-on-error option
Handy one liners I use all the time
Fix garbled shell session
Of any size!
As a continuous stream
Using reboot and poweroff
Manage exceptions
From the command line
Using AppleScript
And those who communicate with them
You won’t fall asleep with this!
The difference between a junior and senior Data Scientist
Hug your kids
The hashlib module
Fred Rogers’ tips
Equipped with cardboard
The read_clipboard function.
The aptly-named makeDataFrame function
The combine and combine_first functions
Via the plot_tree command
Visualize your sklearn pipelines in Jupyter
Specify which columns to apply the most appropriate preprocessing
Create a method chain in pandas
Visualize the cross-validation results
9 things to check with a new data set.
How to explain it to a layperson
Stop typing your password so many times
Simplify your groupbys
Using the xattr command
Via the softwareupdate command
Using the X option of the ls command
A factor of almost exactly a million
Using the aptly named multiprocessing module
Move fast and break things is an abomination if your goal is to create a healthy society
Via hub command or handy url
Cross-pollination at its best
A satirical guide to race issues
Fix this using your .ssh/config file
Annotate or summarize your stashes, and more
The most common values
Using the show full processlist command
A discriminating palate leads to novel rigorous statistical methods
Some quick one liners
The ANSI standard
The filter option
Using the expr function
Using the groupby function
Using the sntp command
Using the zoneinfo command
Using the -u option
Using the git worktree command
Using the cached flag
Using the untracked flag
Using the git stash save command
Using the k flag
A partial stash
Use git merge to squash a large number of commits
Via various configuration options
Using the git show --stat command
Using the follow flag
Using the git reset --patch command
Using the git reflog command
Using the git reflog command
Using the git reflog command
Using the git show command
Via an interactive rebasing
The author flag
Summarizing your day’s work
The name-only flag
The left-right option
Using the git log command
Using the git update-index command
Using the git help command
Using the git hash-object command
Using the git log command
Adding a new type of large file to your repository
Addressing git checkout fails
Tidying up your local git LFS cache
Using the git lfs command
An extension of git
Extra configuration
How LFS integrates with git
Using git fetch origin --prune
Using git diff --name-only master
Via git push
The interactive.singlekey option
The core.editor option
Using the diff.algorithm option
With the amend author options
Via git clean
Using the hyphen shorthand
Using the appropriate hash
Using the checkout command
Using git checkout
A handy one-liner
Using git add -N
Using the savefig function
Free up the address pool
Via the JOBLIB_TEMP_FOLDER environment variable
Comparing two approaches
Interacting with containers
Using awscli
Via the “ci skip” suffix
Via the ‘text.usetex’ parameter
Using the pycat magic command
Via the store_var magic command
Via the env magic command
Via the InteractiveShell.ast_node_interactivity configuration variable
Be kind to future you, … and other developers
Via sqlalchemy
Sort a dataframe by index
Avoid inferring data types
Handle files too large for memory
Via the components property
Using the tilde operator
Via the plot axes.
The get_option, reset_option, and set_option functions.
Temporarily change pandas options
Specify the key column of the merge as the index of your dataframes, then join instead of merge
Set the configuration option display.memory_usage
Via the as_index parameter
Accessing the name and DataFrameGroupBy
Via the apply function
How to apply pandas.tseries.offsets.DateOffset
Via the count method
Using a dictionary of aggregations
Via the searchsorted command
The string representation of a GUID should not be relevant to a program
Using the aws iam list-account-aliases command
UNION removes duplicate records, whereas UNION ALL does not
Via the InlineBackend.figure_format configuration option
Via the precision argument
The characters palette
ditto is slightly more advanced but can be advantageous to cp for several reasons
Generate a DDL statement that can be used to recreate the specified object
Via the git worktree command
Via their Message Archives
Using an expansion operator
Using the shift-backspace command
Via the mathb.in service
Using travis requests
Taking advantage of poor password practices
Some datetime examples
Via the matplotlib.pyplot.suptitle command
Addressing the high dimensionality of these codes
List the file formats for which you have access privileges
Using Local Interpretable Model-Agnostic Explanations
Using select
Remove accents, etc.
Using ‘_’
Using the random module
Using the time module
From the tempfile module
Find a list of all python modules installed on a machine
Follow the convention of putting two underscores at the beginning of the variable’s name
Using the end parameter of the print function
Using the flush keyword
Control compatibility support
Using the natsorted function
The namedtuple
New to Python 3.6
A python gotcha
A function that remembers the values from the enclosing lexical scope even when the program flow is no longer in that scope
Defining a function inline
Using joblib’s Parallel function
Using sklearn.externals.joblib.Memory
Using items()
Using the hashlib module
Using the dis module
Just add ‘:,’ to the format specifier
Using the collections module
For example python_version and sys.platform
An exception to the general interchangeability
Using deepcopy
Using two fors within a comprehension
Using the gc module
Using the issubclass command
Using strip
Using np.timedelta64 itself
How to turn Mac on or off quickly
Using the size command
Using the aws config files
Using the gca command
Using the set_xscale and set_yscale commands
Using the fill_between command
cla v clf v close
The Guardian Project
Using np.nanmax
Using the lsof command
Using the lsb_release command
Using the dd command
Using the dd command
Using the yes command
Using the yes command
Using the where command
Using the --max-line-length flag
For example, watch ls
Can set a sampling period too
Via the tree command
Write to multiple files at the same time
Using the stress command
Using the --human-numeric-sort flag with sort
Using the --numeric-sort flag
Using the apt-cache policy command
Using the apropos command
Using the ack command
Advanced awk usage
Using the -L flag
Using the -N flag
Via the bash command
Execute half your command locally and half remotely
Something else to do rather than mashing the keyboard
Power user commands
Using the rev command
Using the repeat command
Using the qpdf command
Using the pv command
Using the -f flag
Make a command’s output appear to come from a file
Using a special parameter of bash
Using the pbcopy command on macOS
Using Ctrl-x Ctrl-e
On macOS
Using the last command
Using the lsof command
Using yum or apt-get
Grep for the hardware address (HWaddr)
Using control-r
Quick unix tool tip
Quick shortcut
Different substitution options
Using the C option
Using the o flag
Using the l flag
Using the e flag
Using the files-without-match flag
Using getconf to retrieve standard configuration variables
Lists process IDs of all processes that have one or more files open
Using the newer argument
Using the not and path arguments
Using the do-release-upgrade command
Use the no-clobber option of cp
Using the dig command
Using the head flag for curl
Using the user flag of curl
Using the expand/unexpand commands
tr is short for translate characters
Using the squeeze-blank option
Take a new shell for a spin!
Using the number option
Using the occur command
The spark.sql.shuffle.partitions configuration option
Deliberately cause a SHA-1 collision
Optimize your use of Spark DataFrames
Top tips
Mathematical genius resides within every one of us
Use your prefix twice to access inner tmux instance
Using the resize-pane command
Using $
Using x
Using control-u and control-d
Using new-window
Using the -L flag
Using the new command
Using the ls or list-sessions command
Using list-keys
Using kill-session
Using array_length and regexp_split_to_array
Using drop database
Using date_part
Using to_char
Using group by
Using the pg_typeof function
Using the pg_terminate_backend command
Using the pg_stat_activity table
Using the pg_sleep function
Using the pg_size_pretty and pg_relation_size functions
Using the pg_restore function
Using the pg_relation_size function
Using pg_dump function
Using the pg_database_size function
The cd meta-command
Intelligently display results vertically or horizontally
Create single or even multiple columns of values
The current_user variable
Stop repeating yourself
Trade crash-safety for speed
Array, Boolean, String, Numeric, Composite, etc.
Truncate in pairs or via a cascade
Use truncate rather than delete
Via the timing command
The table command
Set a hard timeout
Factorial, square root, absolute value operators
Via show config_file
Via show data_directory
Via show/set timezone
Allow for reproducibility
Execute SQL from the command line
Launch PSQL with a custom configuration
Update the default null
Via the position function
Via the crypt and gen_salt functions
Avoid the OSSP UUID library
md5, sha1, sha224, sha256, sha384 and sha512
Useful meta-commands
Using the du command
The dt command
Using the list command
Using the ‘default values’ options
Using the generate_series function
Using the copy function
Via the clear shell command
Ignore case in email addresses
Using brackets
With one or two dimensions
Using alter table
Using alter sequence
The aptly-named age function
Using the option key
Pick between installed versions
Quick way to generate a table of contents
Just add w=1 to the diff URL
A workaround for the limited unicode set
Using a wildcard
Move the cursor without the arrow keys
The meta key to the rescue
Many published studies are not reproducible!
People don’t trust black-box models
Get inspired to visualize!
If we don’t measure patient outcomes, how do we know how well our healthcare is doing?!
From the 20th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
Why provenance of data is important
Quick way of getting the syntax right
An American pilot shoots down an American plane
My first post
Since ancient times, mankind has studied the sky and wondered what the ‘wandering stars’ (planets) might be. In the last two decades, we have found hundreds ...
In the shadow of the ‘Big Eye’, this is the Little Telescope That Could…