@@ Line 1: / Line 1: @@
 __FORCETOC__
-=Installation=
+Python is the primary language used by data scientists.<br>
+It features some of the most useful scientific computing and machine learning libraries, such as NumPy, TensorFlow, and PyTorch.
+==Installation==
 Use [http://anaconda.com Anaconda].
+* Add <code>C:\Users\[username]\Anaconda3\Scripts</code> to your path
+* Run <code>conda init</code> in your bash shell
+* Run <code>conda config --set auto_activate_base false</code>
+==Usage==
+How to use Python 3.
+===pip===
+Pip is the package manager for Python.<br>
+Your package requirements should be written to <code>requirements.txt</code><br>
+Install all requirements using <code>pip install -r requirements.txt</code>
+===Syntax===
+====Ternary Operator====
+[http://book.pythontips.com/en/latest/ternary_operators.html Reference]
+<syntaxhighlight lang="python">
+is_nice = True
+state = "nice" if is_nice else "not nice"
+</syntaxhighlight>
-=Basic Usage=
+====Lambda Function====
-===Lambda Function===
 <syntaxhighlight lang="python">
 lambda x: x * 2
 </syntaxhighlight>
-==Filesystem Read and Write==
+====Spread====
-===List all files in a folder===
+[https://stackoverflow.com/questions/1993727/expanding-tuples-into-arguments Reference]
+<syntaxhighlight lang="python">
+myfun(*my_tuple)  # unpacks my_tuple into positional arguments
+</syntaxhighlight>
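As a rough illustration of unpacking, the sketch below uses a hypothetical function <code>describe</code> (invented for this example, not part of any library) to show <code>*</code> for positional arguments and <code>**</code> for keyword arguments.

```python
def describe(name, width, height=1):
    """Hypothetical helper used only for this illustration."""
    return f"{name}: {width}x{height}"

args = ("box", 3)
kwargs = {"height": 4}

# * unpacks a tuple (or any iterable) into positional arguments
positional_only = describe(*args)
# ** unpacks a dictionary into keyword arguments
with_keywords = describe(*args, **kwargs)

print(positional_only)
print(with_keywords)
```

The same syntax works in the other direction: a function defined with <code>*args</code> and <code>**kwargs</code> collects extra arguments into a tuple and a dictionary.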
+====For loops====
+<syntaxhighlight lang="python">
+import numpy as np
+# Normal for loop
+for i in range(5):
+  pass
+# 2D for loop
+for i, j in np.ndindex((5, 5)):
+  pass
+</syntaxhighlight>
+===Strings===
+====String Interpolation====
+[https://www.programiz.com/python-programming/string-interpolation Reference]<br>
+Python has 3 syntax variations for string interpolation.
+<syntaxhighlight lang="python">
+name = 'World'
+program = 'Python'
+print(f'Hello {name}! This is {program}')
+print("%s %s" % ('Hello','World',))
+name = 'world'
+program ='python'
+print('Hello {}! This is {}.'.format(name, program))
+print('Hello {name}! This is {program}.'.format(name=name, program=program))
+# Format to two decimal places
+accuracy = 93.456
+print(f"Accuracy: {accuracy:.2f}%")
+# Zero-pad an int to 2 digits
+size = 7
+print(f"Size: {size:02d}")
+</syntaxhighlight>
+===Arrays===
+Use NumPy for array functionality.
+====Array Indexing====
+[https://numpy.org/doc/stable/user/basics.indexing.html NumPy Indexing]<br>
+NumPy supports slicing, integer-array (fancy) indexing, and boolean-mask indexing. See the above reference.
+===Filesystem===
+====Paths====
+Use [https://docs.python.org/3/library/os.path.html <code>os.path</code>]
+<syntaxhighlight lang="python">
+import os.path as path
+my_file = path.join("folder_1", "my_great_dataset.tar.gz")
+# "folder_1\\my_great_dataset.tar.gz" on Windows, "folder_1/my_great_dataset.tar.gz" on Linux/macOS
+# Get the filename with extension
+filename = path.basename(my_file)
+# "my_great_dataset.tar.gz"
+# Get the filename without extension
+filename_no_ext = path.splitext(filename)[0]
+# Note that splitext returns ("my_great_dataset.tar", ".gz")
+</syntaxhighlight>
+If using Python >=3.4, you also have [https://docs.python.org/3/library/pathlib.html <code>pathlib</code>]
+<syntaxhighlight lang="python">
+from pathlib import Path
+p = Path("my_folder")
+# Join paths
+pp = Path(p, "files.tar.gz")
+pp.suffix      # returns ".gz"
+pp.suffixes    # returns [".tar", ".gz"]
+pp.name        # returns "files.tar.gz"
+pp.parent      # returns Path("my_folder")
+</syntaxhighlight>
+;Notes
+* One annoyance with <code>pathlib.Path</code> is that some APIs still expect a string, so you need to convert paths manually
+** This can be done with <code>str()</code> or <code>os.fspath()</code>; note that <code>.resolve()</code> returns an absolute <code>Path</code>, not a string
+* [https://treyhunner.com/2019/01/no-really-pathlib-is-great/ "No really, pathlib is great" by Trey Hunner]
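A minimal sketch of converting a <code>Path</code> to a string, using the placeholder names <code>my_folder</code> and <code>files.tar.gz</code> from the example above. It also shows that <code>.resolve()</code> gives back a <code>Path</code> rather than a string.

```python
import os
from pathlib import Path

# The / operator joins path components
p = Path("my_folder") / "files.tar.gz"

as_str = str(p)            # plain string in the platform's native format
as_fspath = os.fspath(p)   # same result, via the os.PathLike protocol
as_posix = p.as_posix()    # always uses forward slashes

# resolve() makes the path absolute but still returns a Path object
resolved = p.resolve()
print(as_posix, type(resolved).__name__)
```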
+====List all files in a folder====
 [https://stackoverflow.com/questions/3207219/how-do-i-list-all-files-of-a-directory Reference]
 <syntaxhighlight lang="python">
-gazeDir = "Gaze_txt_files"
+import os
+import os.path as path
+gaze_directory = "Gaze_txt_files"
 # List of folders in root folder
-gazeFolders = [path.join(gazeDir, x) for x in os.listdir(gazeDir)]
+gaze_folders = [path.join(gaze_directory, x) for x in os.listdir(gaze_directory)]
 # List of files 2 folders down
-gazeFiles = [path.join(x, y) for x in gazeFolders for y in os.listdir(x)]
+gaze_files = [path.join(x, y) for x in gaze_folders for y in os.listdir(x)]
 </syntaxhighlight>
-===Read entire text file into a list===
+See also the <code>glob</code> module.
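A small sketch of listing files with <code>glob</code>. The folder layout and file names below are made up for the example and are created in a temporary directory so the snippet is self-contained.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny, hypothetical folder tree to search
    os.makedirs(os.path.join(root, "subject_1"))
    for name in ("a.txt", os.path.join("subject_1", "b.txt"), "notes.md"):
        with open(os.path.join(root, name), "w") as f:
            f.write("placeholder")

    # Only .txt files directly inside root
    top_level = glob.glob(os.path.join(root, "*.txt"))
    # "**" with recursive=True also descends into subfolders
    recursive = glob.glob(os.path.join(root, "**", "*.txt"), recursive=True)

    top_level_names = sorted(os.path.basename(p) for p in top_level)
    recursive_names = sorted(os.path.basename(p) for p in recursive)

print(top_level_names, recursive_names)
```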
+====Read/Write a text file as a list of lines====
+Reading <br>
 [https://stackoverflow.com/questions/3925614/how-do-you-read-a-file-into-a-list-in-python/3925701]
 <syntaxhighlight lang="python">
 with open('C:/path/numbers.txt') as f:
      lines = f.read().splitlines()
+</syntaxhighlight>
+Writing<br>
+[https://stackoverflow.com/questions/899103/writing-a-list-to-a-file-with-python/899176]
+<syntaxhighlight lang="python">
+with open('your_file.txt', 'w') as f:
+    f.write("\n".join(my_list))
+</syntaxhighlight>
+====Directories====
+Create, Move, and Delete directories or folders
+<syntaxhighlight lang="python">
+import os, shutil, time
+import os.path as path
+# Create a directory
+os.makedirs("new_dir", exist_ok=True)
+# or os.makedirs(os.path.dirname("new_dir/my_file.txt"), exist_ok=True)
+# Delete an empty directory
+os.rmdir(dir_path)
+# Delete an empty or non-empty directory
+shutil.rmtree(dir_path)
+# Wait until it is deleted
+while os.path.isdir(dir_path):
+  time.sleep(0.01)
+</syntaxhighlight>
+====Copying or moving a file or folder====
+[https://stackoverflow.com/questions/123198/how-do-i-copy-a-file-in-python/30359308 Copying]<br>
+[https://docs.python.org/3.7/library/shutil.html Shutil docs]
+<syntaxhighlight lang="python">
+import shutil
+# Copy a file
+shutil.copy2('original.txt', 'duplicate.txt')
+# Move a file
+shutil.move('original.txt', 'my_folder/original.txt')
 </syntaxhighlight>
-==Regular Expressions (Regex)==
+===Regular Expressions (Regex)===
 [https://docs.python.org/3/howto/regex.html Reference]<br>
+<syntaxhighlight lang="python">
+import re
+my_regex = re.compile(r'height:(\d+)cm')
+my_match = my_regex.match("height:33cm")
+print(my_match[1])
+# 33
+</syntaxhighlight>
+;Notes
+* <code>re.match</code> will return None if there is no match
+* <code>re.match</code> matches from the beginning of the string
+* Use <code>re.search</code> to match from anywhere in the string
+* Use <code>re.findall</code> to find all occurrences from anywhere in the string
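The differences listed in the notes above can be sketched as follows. The sample string is invented for the example.

```python
import re

text = "width:20cm height:33cm height:41cm"
pattern = re.compile(r"height:(\d+)cm")

# match() only succeeds at the start of the string, so this is None
at_start = pattern.match(text)

# search() finds the first occurrence anywhere in the string
first = pattern.search(text)
first_height = first.group(1) if first else None

# findall() returns every captured group as a list of strings
all_heights = pattern.findall(text)

print(at_start, first_height, all_heights)
```

Checking the result against <code>None</code> before indexing, as done for <code>first</code>, avoids a <code>TypeError</code> when nothing matches.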
+===Spawning Processes===
+Use [https://docs.python.org/3/library/subprocess.html subprocess] to spawn other programs.
+<syntaxhighlight lang="python">
+import subprocess
+subprocess.run(["ls", "-l"], cwd="/")
+</syntaxhighlight>
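If you also need the output of the spawned program, one possible approach is sketched below. It assumes Python 3.7+ (for <code>capture_output</code>) and launches the current Python interpreter as the child process so that the example does not depend on <code>ls</code> being available.

```python
import subprocess
import sys

# Run a child process and capture what it prints.
# check=True raises CalledProcessError on a non-zero exit code.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from child')"],
    capture_output=True,
    text=True,
    check=True,
)

output = result.stdout.strip()
print(result.returncode, output)
```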
+===Timing Code===
+[https://stackoverflow.com/questions/2866380/how-can-i-time-a-code-segment-for-testing-performance-with-pythons-timeit StackOverflow]<br>
+[https://docs.python.org/3/library/time.html Python Time Documentation]
+* <code>time.time()</code> returns the seconds since the epoch as a float
+* You can also use <code>timeit</code> to time over several iterations
+<syntaxhighlight lang="python">
+import time
+t0 = time.time()
+# ... code to time goes here ...
+t1 = time.time()
+total = t1-t0
+</syntaxhighlight>
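A minimal sketch of timing over several iterations with <code>timeit</code>. The function <code>build_squares</code> is a made-up workload for the example.

```python
import timeit

def build_squares(n=1000):
    """Hypothetical workload to be timed."""
    return [i * i for i in range(n)]

runs = 200
# timeit.timeit calls the function `runs` times and returns the total seconds
elapsed = timeit.timeit(build_squares, number=runs)
per_call = elapsed / runs

print(f"{per_call * 1e6:.1f} microseconds per call")
```

Averaging over many runs smooths out noise that a single <code>time.time()</code> measurement would include.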
+===requests===
+Use the requests library to download files and scrape webpages<br>
+See [https://www.geeksforgeeks.org/get-post-requests-using-python/ Get and post requests in Python]
+====Get Request====
+<syntaxhighlight lang="python">
+import requests
+url = "https://www.google.com"
+req = requests.get(url)
+req.text
+# To save to disk
+with open("google.html", "wb") as f:
+  f.write(req.content)
+</syntaxhighlight>
+====Post Request====
+<syntaxhighlight lang="python">
+data = {'api_dev_key':API_KEY,
+        'api_option':'paste',
+        'api_paste_code':source_code,
+        'api_paste_format':'python'}
+# sending post request and saving response as response object
+r = requests.post(url = API_ENDPOINT, data = data)
+# extracting response text
+pastebin_url = r.text
+print("The pastebin URL is:%s"%pastebin_url)
+</syntaxhighlight>
+====Download a file====
+[https://stackoverflow.com/questions/16694907/download-large-file-in-python-with-requests SO Answer]
+<syntaxhighlight lang="python">
+import os
+import os.path as path
+import shutil
+import uuid
+import requests
+def download_file(url, folder=None, filename=None):
+    if filename is None:
+        filename = path.basename(url)
+    if folder is None:
+        folder = os.getcwd()
+    full_path = path.join(folder, filename)
+    temp_path = path.join(folder, str(uuid.uuid4()))
+    with requests.get(url, stream=True) as r:
+        r.raise_for_status()
+        with open(temp_path, 'wb') as f:
+            for chunk in r.iter_content(chunk_size=8192):
+                if chunk:
+                    f.write(chunk)
+    shutil.move(temp_path, full_path)
+    return full_path
+</syntaxhighlight>
+===if main===
+[https://stackoverflow.com/questions/419163/what-does-if-name-main-do What does if __name__ == "__main__" do?]
+If you are writing a script with functions you want to be included in other scripts, use <code>__name__</code> to detect if your script is being run or being imported.
+<syntaxhighlight lang="python">
+if __name__ == "__main__":
+  # do..something..here
+  pass
+</syntaxhighlight>
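One common way to organize this, shown here as a sketch with hypothetical function names <code>add</code> and <code>main</code>, is to keep reusable functions at module level and put the script behaviour in <code>main()</code>.

```python
def add(a, b):
    """Reusable function that other scripts can import."""
    return a + b

def main():
    # Only runs when the file is executed directly
    print(add(1, 2))

if __name__ == "__main__":
    main()
```

Importing this file from another script makes <code>add</code> available without triggering the <code>print</code> call.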
+===iterators and iterables===
+Iterables include lists, NumPy arrays, and tuples.
+To create an iterator, pass an iterable to the <code>iter()</code> function.
+<syntaxhighlight lang="python">
+my_arr = [1,2,3,4]
+my_iter = iter(my_arr)
+v1 = next(my_iter)  # 1
+</syntaxhighlight>
+<code>itertools</code> contains many helper functions for interacting with iterables and iterators.
+====zip====
+[https://docs.python.org/3/library/functions.html#zip documentation]
+zip takes any number of iterables and combines them into an iterator of tuples,
+i.e. <code>list(zip([a1, ...], [b1, ...]))</code> gives <code>[(a1, b1), ...]</code>
+====enumerate====
+[https://docs.python.org/3/library/functions.html#enumerate documentation]
+enumerate adds indices to an iterable,
+i.e. <code>list(enumerate([a1, a2, ...], start=0))</code> gives <code>[(0, a1), (1, a2), ...]</code>
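A short sketch of <code>zip</code> and <code>enumerate</code> on concrete values. The lists are invented for the example. Both functions return lazy iterators, so <code>list()</code> is used to view the results.

```python
names = ["apple", "orange", "pear"]
prices = [0.5, 0.75, 1.25]

# Pair up items from two iterables
pairs = list(zip(names, prices))

# Attach an index to each item, starting at 1 instead of 0
numbered = list(enumerate(names, start=1))

# zip stops at the shortest input
shorter = list(zip(names, [10, 20]))

print(pairs)
print(numbered)
print(shorter)
```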
+====slice====
+<code>itertools.islice</code> will allow you to create a slice from an iterable
+<syntaxhighlight lang="python">
+from itertools import islice
+import numpy as np
+a = np.arange(5)
+b = islice(a, 3)
+list(b) # [0,1,2]
+</syntaxhighlight>
+===Exceptions===
+See [https://docs.python.org/3/library/exceptions.html https://docs.python.org/3/library/exceptions.html]
+;Raising
+<syntaxhighlight lang="python">
+raise ValueError("You have bad inputs")
+assert 1 == 1, "Something is very wrong if 1 != 1"
+</syntaxhighlight>
+;Try Catch/Except
+<syntaxhighlight lang="python">
+try:
+  something_which_may_raise()
+except AssertionError as error:
+  do_fallback()
+  raise # Raise the previous error.
+else:
+  do_something_if_no_exception()
+finally:
+  finish_program_and_cleanup()
+</syntaxhighlight>
+==Classes==
+===Static and Class methods===
+See [https://realpython.com/instance-class-and-static-methods-demystified/ realpython]
+<syntaxhighlight lang="python">
+class MyClass:
+    def method(self):
+        return 'instance method called', self
+    @classmethod
+    def classmethod(cls):
+        return 'class method called', cls
+    @staticmethod
+    def staticmethod():
+        return 'static method called'
+</syntaxhighlight>
+;Notes
+* Note that the Google Python style guide discourages use of static methods.
+** Class methods should only be used to define alternative constructors (e.g. from_matrix).
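As a sketch of the alternative-constructor pattern mentioned in the notes, the hypothetical class <code>Point</code> below (invented for this example) offers <code>from_tuple</code> alongside the normal constructor.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @classmethod
    def from_tuple(cls, pair):
        """Alternative constructor: build a Point from an (x, y) tuple."""
        x, y = pair
        # Using cls instead of Point keeps this working for subclasses
        return cls(x, y)

p = Point.from_tuple((3, 4))
print(p.x, p.y)
```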
+==Multithreading==
+===threading===
+[https://docs.python.org/3/library/threading.html?highlight=threading#module-threading <code>import threading</code>]
+Use <code>threading.Thread</code> to create a thread.
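A minimal sketch of starting and joining threads. The function <code>worker</code> and the shared list <code>results</code> are made up for the example.

```python
import threading

results = []

def worker(n):
    # Appending to a list is a single operation, safe here without a lock
    results.append(n * n)

# Create one thread per input value
threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
# join() blocks until the thread has finished
for t in threads:
    t.join()

print(sorted(results))
```

The order in which threads finish is not guaranteed, which is why the results are sorted before printing.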
+===concurrency===
+In Python 3.2+, [https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures <code>concurrent.futures</code>] gives you access to thread pools.
+<syntaxhighlight lang="python">
+import os
+import threading
+from concurrent.futures import ThreadPoolExecutor, as_completed
+executor = ThreadPoolExecutor(max_workers=os.cpu_count())
+thread_lock = threading.Lock()
+total = 0
+def do_something(a, b):
+  global total  # required to rebind the module-level variable
+  with thread_lock:
+    total += a + b
+    return total
+my_futures = []
+for i in range(5):
+  future = executor.submit(do_something, 1, 2+i)
+  my_futures.append(future)
+for future in as_completed(my_futures):
+  future.result()
+executor.shutdown()
+</syntaxhighlight>
+* <code>len(os.sched_getaffinity(0))</code> returns the number of logical CPUs the Python process is allowed to run on (only available on some Unix platforms).
+* In Python 3.5 to 3.7, if <code>max_workers</code> is <code>None</code>, it defaults to <code>5 * os.cpu_count()</code>; since Python 3.8 the default is <code>min(32, os.cpu_count() + 4)</code>.
+** <code>os.cpu_count()</code> returns the number of logical CPUs (i.e. hardware threads)
+* <code>executor.shutdown()</code> waits for all pending jobs to finish; after calling it, no thread can submit additional jobs.
+* In CPython, single list operations such as <code>append</code> are thread-safe, but compound read-modify-write operations (e.g. <code>total += x</code>) require a thread lock or semaphore.
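As an alternative to calling <code>shutdown()</code> by hand, the executor can be used as a context manager. The sketch below uses a made-up function <code>square</code> and <code>executor.map</code>, which returns results in input order.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    """Hypothetical task to run in the pool."""
    return x * x

# Leaving the with-block calls shutdown(wait=True) automatically
with ThreadPoolExecutor(max_workers=4) as executor:
    # map() yields results in the same order as the inputs
    results = list(executor.map(square, range(5)))

print(results)
```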
+==Data Structures==
+===Tuples===
+Tuples are immutable lists. This means that they have a fixed size and fixed elements, though the elements themselves may be mutable.
+In general, they perform marginally faster than lists, so you should use tuples over lists when possible, especially as parameters to functions.
+Typically people use tuples as structs, i.e. objects with structure such as coordinates. See [https://stackoverflow.com/questions/626759/whats-the-difference-between-lists-and-tuples StackOverflow: Difference between lists and tuples].
+<syntaxhighlight lang="python">
+# Tuple with one element
+m_tuple = (1,)
+# Tuple with multiple elements
+vals = (1,2,3, "car")
+# Return a tuple
+def int_divide(a, b):
+  return a // b, a % b
+</syntaxhighlight>
+===Lists===
+Lists are the default sequence data structure in Python.<br>
+A lot of functional programming can be done with lists.<br>
+<syntaxhighlight lang="python">
+groceries = ["apple", "orange"]
+groceries.reverse()
+# ["orange", "apple"]
+groceries_str = ",".join(groceries)
+# "orange,apple"
+groceries_str.split(",")
+# ["orange", "apple"]
+# Note that functions such as map, enumerate, range return iterables
+# which you can iterate over in a for loop
+# You can also convert these to lists by calling list() if necessary
+list(enumerate(groceries))
+# [(0, "orange"), (1, "apple")]
+</syntaxhighlight>
+===Dictionaries===
+Dictionaries are hashmaps in Python<br>
+<syntaxhighlight lang="python">
+# Create a dictionary
+my_map = {}
+# Or
+my_map = {1: "a", 2: "b"}
+# Check if a key is in a dictionary
+# O(1) on average
+1 in my_map
+# Check if a value is in a dictionary
+# Usually you should have a second dictionary if you need this functionality
+# O(n)
+'a' in my_map.values()
+# Loop through dictionary
+for k in my_map:
+   print(k)
+# With key and value
+for k, v in my_map.items():
+   print(k, v)
+</syntaxhighlight>
-=Libraries=
 ==Numpy==
+{{main | NumPy}}
+See also CuPy, which is a NumPy-compatible interface implemented with CUDA for GPU acceleration. Large speedups can be had for big arrays.
+===random===
+Legacy code uses functions from <code>np.random.*</code>.
+New code should initialize a rng using <code>np.random.default_rng()</code>.
+See [https://numpy.org/doc/stable/reference/random/generator.html Random Generator] for more details.
+<syntaxhighlight lang="python">
+import numpy as np
+rng = np.random.default_rng()
+# Random integer between [0, 6)
+rng.integers(0, 6)
+# array of 5 random integers
+rng.integers(0, 6, size=5)
+</syntaxhighlight>
+==Anaconda==
+{{main | Anaconda (Python distribution) }}
+How to use Anaconda:
+<syntaxhighlight lang="bash">
+# Create an environment
+conda create -n tf2 python=3.6
+# Activate an environment
+conda activate tf2
+# Change version of Python
+conda install python=3.7
+# Update all packages
+conda update --all
+</syntaxhighlight>
+;Documentation
+* [https://docs.conda.io/projects/conda/en/latest/commands/install.html <code>conda install</code>]
+;Notes
+* Use flag <code>--force-reinstall</code> to reinstall packages
+==JSON==
+[https://docs.python.org/3/library/json.html Documentation]
+<syntaxhighlight lang="python">
+import json
+# Encode/Stringify (pretty)
+json.dumps({}, indent=2)
+# Decode/Parse
+json.loads("{}")
+# Write to file
+with open("my_data.json", "w") as f:
+  json.dump(my_data, f, indent=2)
+# Read from file
+with open("my_data.json", "r") as f:
+  my_data = json.load(f)
+</syntaxhighlight>
+; Notes
+* Using <code>json.dump(data, f)</code> will dump without pretty printing
+** Add indent parameter for pretty printing.
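The effect of the <code>indent</code> parameter can be seen in a small round trip. The dictionary <code>record</code> is invented for the example.

```python
import json

record = {"name": "run_1", "scores": [0.5, 0.75]}

# Without indent: everything on one line
compact = json.dumps(record)
# With indent: one key or list item per line
pretty = json.dumps(record, indent=2)

# Both forms decode back to the same data
restored = json.loads(pretty)

print(compact)
print(pretty)
```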
+==Type Annotations==
+Python 3 supports adding type annotations. However, they are not enforced at runtime.
+You can check types ahead of time using [https://google.github.io/pytype/ pytype].
+<syntaxhighlight lang="python">
+def add_two_values(a: float, b: float) -> float:
+    return a + b
+</syntaxhighlight>
+==Images==
+===Pillow (PIL)===
+<code>pip install pillow</code>
+<syntaxhighlight lang="python">
+import numpy as np
+from PIL import Image, ImageOps
+img = Image.open("my_image.png")
+# For an 8-bit RGBA image, converts to a uint8 array of shape (H, W, 4); RGB images give 3 channels
+img = np.array(img)
+</syntaxhighlight>
+* <code>ImageOps.flip(img)</code> - Returns an image flipped vertically (top to bottom)
+* <code>ImageOps.mirror(img)</code> - Returns an image flipped horizontally (left to right)
+===Bilinear Interpolation===
+Copied from [https://stackoverflow.com/questions/12729228/simple-efficient-bilinear-interpolation-of-images-in-numpy-and-python https://stackoverflow.com/questions/12729228/simple-efficient-bilinear-interpolation-of-images-in-numpy-and-python]
+{{ hidden | Bilinear Interpolation function |
+<syntaxhighlight lang="python">
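+import numpy as np
+def bilinear_interpolate(im, x, y):
+    """
+    Basic bilinear interpolation
+    :param im: image array of shape (H, W) or (H, W, C)
+    :param x: x (column) coordinates to sample, scalar or array
+    :param y: y (row) coordinates to sample, same shape as x
+    :return: interpolated values at the given (x, y) positions
+    """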
+    x = np.asarray(x)
+    y = np.asarray(y)
+    x0 = np.floor(x).astype(int)
+    x1 = x0 + 1
+    y0 = np.floor(y).astype(int)
+    y1 = y0 + 1
+    x0 = np.clip(x0, 0, im.shape[1] - 1)
+    x1 = np.clip(x1, 0, im.shape[1] - 1)
+    y0 = np.clip(y0, 0, im.shape[0] - 1)
+    y1 = np.clip(y1, 0, im.shape[0] - 1)
+    Ia = im[y0, x0]
+    Ib = im[y1, x0]
+    Ic = im[y0, x1]
+    Id = im[y1, x1]
+    wa = (x1 - x) * (y1 - y)
+    wb = (x1 - x) * (y - y0)
+    wc = (x - x0) * (y1 - y)
+    wd = (x - x0) * (y - y0)
+    if len(Ia.shape) > len(wa.shape):
+        wa = wa[..., np.newaxis]
+        wb = wb[..., np.newaxis]
+        wc = wc[..., np.newaxis]
+        wd = wd[..., np.newaxis]
+    return wa * Ia + wb * Ib + wc * Ic + wd * Id
+</syntaxhighlight>
+}}
+==Libraries==
+Other notable libraries.
+===Matplotlib===
+{{main | Matplotlib}}
+Matplotlib is the main library used for making graphs.<br>
+[https://matplotlib.org/examples/ Examples]<br>
+[https://matplotlib.org/3.1.1/gallery/index.html Gallery]
+Alternatively, there are also Python bindings for ggplot2<br>
+===configargparse===
+[https://pypi.org/project/ConfigArgParse/ ConfigArgParse] is a drop-in replacement for argparse that also lets you set options through config files.
+<syntaxhighlight lang="python">
+import configargparse
+parser = configargparse.ArgParser()
+parser.add('-c', '--config', is_config_file=True, help='config file path')
+# Parse all args, throw exception on unknown args.
+parser.parse_args()
+# Parse only known args.
+parser.parse_known_args()
+</syntaxhighlight>
+If you want to use bools without store-true or store-false, you need to define a <code>str2bool</code> function:
+[https://stackoverflow.com/questions/15008758/parsing-boolean-values-with-argparse Stack Overflow Answer]
+{{ hidden | str2bool |
+<syntaxhighlight lang="python">
+import argparse
+def str2bool(val):
+  """Converts the string value to a bool.
+  Args:
+    val: string representing true or false
+  Returns:
+    bool
+  """
+  if isinstance(val, bool):
+    return val
+  if val.lower() in ('yes', 'true', 't', 'y', '1'):
+    return True
+  elif val.lower() in ('no', 'false', 'f', 'n', '0'):
+    return False
+  else:
+    raise argparse.ArgumentTypeError('Boolean value expected.')
+#...
+parser.add_argument("--augment",
+                    type=str2bool,
+                    help="Augment",
+                    default=False)
+</syntaxhighlight>
+}}
+[[Category:Programming languages]]
