Technical Interviews: Difference between revisions
(7 intermediate revisions by the same user not shown) | |||
Line 5: | Line 5: | ||
==Algorithms== | ==Algorithms== | ||
{{main | Interview Algorithms}} | {{main | Interview Algorithms}} | ||
In general, consider the following approaches | |||
* Brute-force and searching | |||
* Dynamic Programming | |||
* Greedy (sorting + take the best at each step) | |||
* Linear searching (with 2 pointers or sliding window) | |||
** Usually the answer if you're searching for a connected segment along an array or string | |||
** E.g. max contiguous subarray | |||
Be familiar with common algorithms and data structures: | |||
;Sorting | |||
* Merge sort | |||
* Quicksort | |||
* Quickselect | |||
* Counting sort | |||
* Radix sort | |||
* Binary search | |||
;Linked-lists | |||
* Doubly linked list, circular linked list | |||
;Trees | |||
* Preorder, inorder, postorder traversal | |||
* AVL rotations | |||
* Union-Find | |||
* Heaps | |||
* Using <code>std::set</code>, <code>std::map</code>, <code>std::priority_queue</code> | |||
;Graphs | |||
* DFS, BFS | |||
* Minimum spanning tree (Prim's or Kruskal's algorithm) | |||
;Hashing | |||
* Using <code>std::unordered_set</code> and <code>std::unordered_map</code> | |||
==Computer Systems== | ==Computer Systems== | ||
Line 33: | Line 68: | ||
* Make sure you can identify server from id of data (e.g. adding it to hashing or maintaining an index) | * Make sure you can identify server from id of data (e.g. adding it to hashing or maintaining an index) | ||
====Caches | ====Caches==== | ||
* Place a cache on the request layer. | * Place a cache on the request layer. | ||
* Cache in memory, not disk (e.g. [http://memcached.org/ memcached], redis) | * Cache in memory, not disk (e.g. [http://memcached.org/ memcached], redis) | ||
Line 47: | Line 82: | ||
* Collapsed Forwarding: collapse the same or similar requests and return the same result to clients | * Collapsed Forwarding: collapse the same or similar requests and return the same result to clients | ||
* You can also collapse requests that are spatially close together on the database | * You can also collapse requests that are spatially close together on the database | ||
====Indexes==== | ====Indexes==== | ||
* Indexes are used to quickly find small data in large datasets | |||
* Indexes are stored in memory, data separated across several servers | |||
* Possibly many layers of indexes | |||
====Load Balancers==== | ====Load Balancers==== | ||
Line 67: | Line 104: | ||
==System Design== | ==System Design== | ||
==Misc== | |||
* [https://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.html Steve Yegge Get that job at Google] |
Latest revision as of 05:53, 17 May 2024
Resources for Technical Interviews
Algorithms
In general, consider the following approaches
- Brute-force and searching
- Dynamic Programming
- Greedy (sorting + take the best at each step)
- Linear searching (with 2 pointers or sliding window)
- Usually the answer if you're searching for a connected segment along an array or string
- E.g. max contiguous subarray
Be familiar with common algorithms and data structures:
- Sorting
- Merge sort
- Quicksort
- Quickselect
- Counting sort
- Radix sort
- Binary search
- Linked-lists
- Doubly linked list, circular linked list
- Trees
- Preorder, inorder, postorder traversal
- AVL rotations
- Union-Find
- Heaps
- Using
std::set
,std::map
,std::priority_queue
- Graphs
- DFS, BFS
- Minimum spanning tree (Prim's or Kruskal's algorithm)
- Hashing
- Using
std::unordered_set
andstd::unordered_map
Computer Systems
Web Architecture
Four Key Principles
- Availability
- Performance
- Reliability
- Scalability
- Manageability
- Cost
Services
- Separate each functionality into its own service so each can be scaled separately
- (e.g. Reading and Writing can be two services which access the same file store)
- Each service can then be managed separately.
Redundency
- Replicate nodes for each service using a "shared-nothing architecture."
- Each node should be able to operate independently.
Partitions
- Also known as shards
- Scaling vertically: Add more hard drives, memory
- Scaling Horizontally: Add more nodes
- Distribute data somehow: geographically, by type of user,...
- Make sure you can identify server from id of data (e.g. adding it to hashing or maintaining an index)
Caches
- Place a cache on the request layer.
- Cache in memory, not disk (e.g. memcached, redis)
- Global Caches
- If you have several request nodes, you can put a global cache between the request node and the database
- Distributed Cache
- Each request node holds a cache for a specific type of data
- A request node may forward requests to other request nodes which have the corresponding cache
- Cons: May be hard to remedy if a node goes down
Proxies
- Used to filter, log, and modify requests
- Collapsed Forwarding: collapse the same or similar requests and return the same result to clients
- You can also collapse requests that are spatially close together on the database
Indexes
- Indexes are used to quickly find small data in large datasets
- Indexes are stored in memory, data separated across several servers
- Possibly many layers of indexes
Load Balancers
- Purpose: Handle many connections and route them to request nodes
- Algorithms: Random, round robin, custom criteria
- Software: HAProxy
- Usually placed at the front of a distributed system
- Load balancers can send requests to other load balancers
- Challenges: Managing user session data
- Solutions: Caches, cookies, user data, URL rewriting
- Cons: Makes problem diagnosis difficult
Queues
- Schedule client tasks in a queue, acknowledge task received immediately
- Queues protect from service outages and failures
- Software: RabbitMQ, ActiveMQ