Technical Interviews: Difference between revisions

Latest revision as of 05:53, 17 May 2024

Resources for Technical Interviews

Algorithms

In general, consider the following approaches

Brute-force and searching
Dynamic Programming
Greedy (sorting + take the best at each step)
Linear searching (with 2 pointers or sliding window)
- Usually the answer if you're searching for a connected segment along an array or string
- E.g. max contiguous subarray

Be familiar with common algorithms and data structures:

Sorting

Merge sort
Quicksort
Quickselect
Counting sort
Radix sort
Binary search

Linked-lists

Doubly linked list, circular linked list

Trees

Preorder, inorder, postorder traversal
AVL rotations
Union-Find
Heaps
Using std::set, std::map, std::priority_queue

Graphs

DFS, BFS
Minimum spanning tree (Prim's or Kruskal's algorithm)

Hashing

Using std::unordered_set and std::unordered_map

Computer Systems

Web Architecture

Scalable Web Architecture and Distributed Systems

Four Key Principles

Availability
Performance
Reliability
Scalability
Manageability
Cost

Services

Separate each functionality into its own service so each can be scaled separately

(e.g. Reading and Writing can be two services which access the same file store)

Each service can then be managed separately.

Redundency

Replicate nodes for each service using a "shared-nothing architecture."
Each node should be able to operate independently.

Partitions

Also known as shards
Scaling vertically: Add more hard drives, memory
Scaling Horizontally: Add more nodes
Distribute data somehow: geographically, by type of user,...
Make sure you can identify server from id of data (e.g. adding it to hashing or maintaining an index)

Caches

Place a cache on the request layer.
Cache in memory, not disk (e.g. memcached, redis)

Global Caches

If you have several request nodes, you can put a global cache between the request node and the database

Distributed Cache

Each request node holds a cache for a specific type of data
A request node may forward requests to other request nodes which have the corresponding cache
Cons: May be hard to remedy if a node goes down

Proxies

Used to filter, log, and modify requests
Collapsed Forwarding: collapse the same or similar requests and return the same result to clients
You can also collapse requests that are spatially close together on the database

Indexes

Indexes are used to quickly find small data in large datasets
Indexes are stored in memory, data separated across several servers
Possibly many layers of indexes

Load Balancers

Purpose: Handle many connections and route them to request nodes
Algorithms: Random, round robin, custom criteria
Software: HAProxy
Usually placed at the front of a distributed system
- Load balancers can send requests to other load balancers
Challenges: Managing user session data
- Solutions: Caches, cookies, user data, URL rewriting
Cons: Makes problem diagnosis difficult

@@ Line 5: / Line 5: @@
 ==Algorithms==
 {{main | Interview Algorithms}}
+In general, consider the following approaches
+* Brute-force and searching
+* Dynamic Programming
+* Greedy (sorting + take the best at each step)
+* Linear searching (with 2 pointers or sliding window)
+** Usually the answer if you're searching for a connected segment along an array or string
+** E.g. max contiguous subarray
+Be familiar with common algorithms and data structures:
+;Sorting
+* Merge sort
+* Quicksort
+* Quickselect
+* Counting sort
+* Radix sort
+* Binary search
+;Linked-lists
+* Doubly linked list, circular linked list
+;Trees
+* Preorder, inorder, postorder traversal
+* AVL rotations
+* Union-Find
+* Heaps
+* Using <code>std::set</code>, <code>std::map</code>, <code>std::priority_queue</code>
+;Graphs
+* DFS, BFS
+* Minimum spanning tree (Prim's or Kruskal's algorithm)
+;Hashing
+* Using <code>std::unordered_set</code> and <code>std::unordered_map</code>
 ==Computer Systems==
-===Web Architecture
+===Web Architecture===
 * [https://www.aosabook.org/en/distsys.html Scalable Web Architecture and Distributed Systems]
+Four Key Principles
+* Availability
+* Performance
+* Reliability
+* Scalability
+* Manageability
+* Cost
+====Services====
+* Separate each functionality into its own service so each can be scaled separately
+: (e.g. Reading and Writing can be two services which access the same file store)
+* Each service can then be managed separately.
+====Redundency====
+* Replicate nodes for each service using a "shared-nothing architecture."
+* Each node should be able to operate independently.
+====Partitions====
+* Also known as shards
+* Scaling vertically: Add more hard drives, memory
+* Scaling Horizontally: Add more nodes
+* Distribute data somehow: geographically, by type of user,...
+* Make sure you can identify server from id of data (e.g. adding it to hashing or maintaining an index)
+====Caches====
+* Place a cache on the request layer.
+* Cache in memory, not disk (e.g. [http://memcached.org/ memcached], redis)
+; Global Caches
+* If you have several request nodes, you can put a global cache between the request node and the database
+; Distributed Cache
+* Each request node holds a cache for a specific type of data
+* A request node may forward requests to other request nodes which have the corresponding cache
+* Cons: May be hard to remedy if a node goes down
+====Proxies====
+* Used to filter, log, and modify requests
+* Collapsed Forwarding: collapse the same or similar requests and return the same result to clients
+* You can also collapse requests that are spatially close together on the database
+====Indexes====
+* Indexes are used to quickly find small data in large datasets
+* Indexes are stored in memory, data separated across several servers
+* Possibly many layers of indexes
+====Load Balancers====
+* Purpose: Handle many connections and route them to request nodes
+* Algorithms: Random, round robin, custom criteria
+* Software: [https://www.haproxy.org/ HAProxy]
+* Usually placed at the front of a distributed system
+** Load balancers can send requests to other load balancers
+* Challenges: Managing user session data
+** Solutions: Caches, cookies, user data, URL rewriting
+* Cons: Makes problem diagnosis difficult
+====Queues====
+* Schedule client tasks in a queue, acknowledge task received immediately
+* Queues protect from service outages and failures
+* Software: RabbitMQ, ActiveMQ
 ==System Design==
+==Misc==
+* [https://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.html Steve Yegge Get that job at Google]