Technical Interviews: Difference between revisions

From David's Wiki
 
(7 intermediate revisions by the same user not shown)
Line 5: Line 5:
==Algorithms==
==Algorithms==
{{main | Interview Algorithms}}
{{main | Interview Algorithms}}
In general, consider the following approaches
* Brute-force and searching
* Dynamic Programming
* Greedy (sorting + take the best at each step)
* Linear searching (with 2 pointers or sliding window)
** Usually the answer if you're searching for a connected segment along an array or string
** E.g. max contiguous subarray
Be familiar with common algorithms and data structures:
;Sorting
* Merge sort
* Quicksort
* Quickselect
* Counting sort
* Radix sort
* Binary search
;Linked-lists
* Doubly linked list, circular linked list
;Trees
* Preorder, inorder, postorder traversal
* AVL rotations
* Union-Find
* Heaps
* Using <code>std::set</code>, <code>std::map</code>, <code>std::priority_queue</code>
;Graphs
* DFS, BFS
* Minimum spanning tree (Prim's or Kruskal's algorithm)
;Hashing
* Using <code>std::unordered_set</code> and <code>std::unordered_map</code>


==Computer Systems==
==Computer Systems==
Line 33: Line 68:
* Make sure you can identify server from id of data (e.g. adding it to hashing or maintaining an index)
* Make sure you can identify server from id of data (e.g. adding it to hashing or maintaining an index)


====Caches=====
====Caches====
* Place a cache on the request layer.
* Place a cache on the request layer.
* Cache in memory, not disk (e.g. [http://memcached.org/ memcached], redis)
* Cache in memory, not disk (e.g. [http://memcached.org/ memcached], redis)
Line 47: Line 82:
* Collapsed Forwarding: collapse the same or similar requests and return the same result to clients
* Collapsed Forwarding: collapse the same or similar requests and return the same result to clients
* You can also collapse requests that are spatially close together on the database
* You can also collapse requests that are spatially close together on the database


====Indexes====
====Indexes====
* Indexes are used to quickly find small data in large datasets
* Indexes are stored in memory, data separated across several servers
* Possibly many layers of indexes


====Load Balancers====
====Load Balancers====
Line 67: Line 104:


==System Design==
==System Design==
==Misc==
* [https://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.html Steve Yegge Get that job at Google]

Latest revision as of 05:53, 17 May 2024

Resources for Technical Interviews


Algorithms

In general, consider the following approaches

  • Brute-force and searching
  • Dynamic Programming
  • Greedy (sorting + take the best at each step)
  • Linear searching (with 2 pointers or sliding window)
    • Usually the answer if you're searching for a connected segment along an array or string
    • E.g. max contiguous subarray

Be familiar with common algorithms and data structures:

Sorting
  • Merge sort
  • Quicksort
  • Quickselect
  • Counting sort
  • Radix sort
  • Binary search
Linked-lists
  • Doubly linked list, circular linked list
Trees
  • Preorder, inorder, postorder traversal
  • AVL rotations
  • Union-Find
  • Heaps
  • Using std::set, std::map, std::priority_queue
Graphs
  • DFS, BFS
  • Minimum spanning tree (Prim's or Kruskal's algorithm)
Hashing
  • Using std::unordered_set and std::unordered_map

Computer Systems

Web Architecture

Four Key Principles

  • Availability
  • Performance
  • Reliability
  • Scalability
  • Manageability
  • Cost

Services

  • Separate each functionality into its own service so each can be scaled separately
(e.g. Reading and Writing can be two services which access the same file store)
  • Each service can then be managed separately.

Redundency

  • Replicate nodes for each service using a "shared-nothing architecture."
  • Each node should be able to operate independently.

Partitions

  • Also known as shards
  • Scaling vertically: Add more hard drives, memory
  • Scaling Horizontally: Add more nodes
  • Distribute data somehow: geographically, by type of user,...
  • Make sure you can identify server from id of data (e.g. adding it to hashing or maintaining an index)

Caches

  • Place a cache on the request layer.
  • Cache in memory, not disk (e.g. memcached, redis)
Global Caches
  • If you have several request nodes, you can put a global cache between the request node and the database
Distributed Cache
  • Each request node holds a cache for a specific type of data
  • A request node may forward requests to other request nodes which have the corresponding cache
  • Cons: May be hard to remedy if a node goes down

Proxies

  • Used to filter, log, and modify requests
  • Collapsed Forwarding: collapse the same or similar requests and return the same result to clients
  • You can also collapse requests that are spatially close together on the database

Indexes

  • Indexes are used to quickly find small data in large datasets
  • Indexes are stored in memory, data separated across several servers
  • Possibly many layers of indexes

Load Balancers

  • Purpose: Handle many connections and route them to request nodes
  • Algorithms: Random, round robin, custom criteria
  • Software: HAProxy
  • Usually placed at the front of a distributed system
    • Load balancers can send requests to other load balancers
  • Challenges: Managing user session data
    • Solutions: Caches, cookies, user data, URL rewriting
  • Cons: Makes problem diagnosis difficult

Queues

  • Schedule client tasks in a queue, acknowledge task received immediately
  • Queues protect from service outages and failures
  • Software: RabbitMQ, ActiveMQ

System Design

Misc