Technical Interviews

 
==Algorithms==
{{main | Interview Algorithms}}
In general, consider the following approaches:
* Brute force and searching
* Dynamic programming
* Greedy (sort, then take the best option at each step)
* Linear scan (with two pointers or a sliding window)
** Usually the answer if you're searching for a contiguous segment of an array or string
** E.g. maximum contiguous subarray (see the sketch below)
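As an illustration of the linear-scan idea, here is a minimal sketch of Kadane's algorithm for the maximum contiguous subarray sum (function and variable names are illustrative):
<syntaxhighlight lang="cpp">
#include <algorithm>
#include <cstddef>
#include <vector>

// Kadane's algorithm: one O(n) pass, tracking the best sum of a subarray
// ending at the current index and the best sum seen overall.
// Assumes the input vector is non-empty.
int maxSubarraySum(const std::vector<int>& a) {
    int bestEndingHere = a[0];
    int bestOverall = a[0];
    for (std::size_t i = 1; i < a.size(); ++i) {
        bestEndingHere = std::max(a[i], bestEndingHere + a[i]);
        bestOverall = std::max(bestOverall, bestEndingHere);
    }
    return bestOverall;
}
</syntaxhighlight>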
Be familiar with common algorithms and data structures:
;Sorting and searching
* Merge sort
* Quicksort
* Quickselect
* Counting sort
* Radix sort
* Binary search
;Linked-lists
* Doubly linked list, circular linked list
;Trees
* Preorder, inorder, postorder traversal
* AVL rotations
* Union-Find
* Heaps
* Using <code>std::set</code>, <code>std::map</code>, <code>std::priority_queue</code>
;Graphs
* DFS, BFS
* Minimum spanning tree (Prim's or Kruskal's algorithm)
;Hashing
* Using <code>std::unordered_set</code> and <code>std::unordered_map</code>
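As a quick example of using <code>std::unordered_map</code>, a minimal sketch of the classic two-sum problem (names are illustrative):
<syntaxhighlight lang="cpp">
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hashing example: return indices of two numbers that sum to target,
// or std::nullopt if no such pair exists. O(n) average time, one pass.
std::optional<std::pair<int, int>> twoSum(const std::vector<int>& a, int target) {
    std::unordered_map<int, int> seen;  // value -> index where it was seen
    for (int i = 0; i < static_cast<int>(a.size()); ++i) {
        auto it = seen.find(target - a[i]);
        if (it != seen.end()) return std::make_pair(it->second, i);
        seen[a[i]] = i;
    }
    return std::nullopt;
}
</syntaxhighlight>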


==Computer Systems==
===Web Architecture===
* [https://www.aosabook.org/en/distsys.html Scalable Web Architecture and Distributed Systems]
Key principles of scalable web architecture:
* Availability
* Performance
* Reliability
* Scalability
* Manageability
* Cost
====Services====
* Separate each piece of functionality into its own service so that each can be scaled independently (see the sketch below)
: (e.g. reading and writing can be two services that access the same file store)
* Each service can then be managed separately.
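A minimal sketch of such a read/write split, assuming a shared file store behind both services (interface and method names are illustrative):
<syntaxhighlight lang="cpp">
#include <string>

// Reads and writes split into two services over the same underlying file
// store, so each side can be scaled and managed independently.
class FileReadService {
public:
    virtual ~FileReadService() = default;
    virtual std::string fetch(const std::string& fileId) = 0;
};

class FileWriteService {
public:
    virtual ~FileWriteService() = default;
    virtual void upload(const std::string& fileId, const std::string& bytes) = 0;
};
</syntaxhighlight>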
====Redundancy====
* Replicate nodes for each service using a "shared-nothing architecture."
* Each node should be able to operate independently.
====Partitions====
* Also known as shards
* Scaling vertically: add more resources to each node (hard drives, memory)
* Scaling horizontally: add more nodes
* Distribute data across partitions somehow: geographically, by type of user, ...
* Make sure you can identify which server holds a piece of data from its ID (e.g. by including it in the hash or by maintaining an index); see the sketch below
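A minimal sketch of the hashing approach (the modulo scheme shown is the simplest version; consistent hashing is a common refinement so that adding a shard does not remap most keys):
<syntaxhighlight lang="cpp">
#include <cstddef>
#include <functional>
#include <string>

// Simplest shard lookup: hash the data ID and take it modulo the shard count.
// Assumes numShards > 0. Note that adding or removing a shard remaps almost
// every key, which is why consistent hashing is often preferred.
std::size_t shardFor(const std::string& dataId, std::size_t numShards) {
    return std::hash<std::string>{}(dataId) % numShards;
}
</syntaxhighlight>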
====Caches====
* Place a cache on the request layer.
* Cache in memory, not on disk (e.g. [http://memcached.org/ memcached], Redis)
; Global Caches
* If you have several request nodes, you can put a global cache between the request node and the database
; Distributed Cache
* Each request node holds a portion of the cached data (e.g. assigned by consistent hashing)
* A request node may forward a request to whichever node holds the corresponding cache entry
* Cons: if a node goes down, its portion of the cache is lost, which may be hard to remedy
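Request-layer caches are commonly in-memory LRU maps; a minimal sketch of one (capacity is assumed to be at least 1, and the string key/value types are illustrative):
<syntaxhighlight lang="cpp">
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Minimal in-memory LRU cache: a list keeps keys in recency order (most
// recent at the front) and a hash map points each key at its list node
// and cached value. Assumes capacity >= 1.
class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {}

    std::optional<std::string> get(const std::string& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return std::nullopt;                // cache miss
        order_.splice(order_.begin(), order_, it->second.first);    // mark most recent
        return it->second.second;
    }

    void put(const std::string& key, std::string value) {
        auto it = index_.find(key);
        if (it != index_.end()) {                                    // update existing entry
            it->second.second = std::move(value);
            order_.splice(order_.begin(), order_, it->second.first);
            return;
        }
        if (index_.size() == capacity_) {                            // evict least recently used
            index_.erase(order_.back());
            order_.pop_back();
        }
        order_.push_front(key);
        index_.emplace(key, std::make_pair(order_.begin(), std::move(value)));
    }

private:
    std::size_t capacity_;
    std::list<std::string> order_;                                   // keys, most recent first
    std::unordered_map<std::string,
        std::pair<std::list<std::string>::iterator, std::string>> index_;
};
</syntaxhighlight>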
====Proxies====
* Used to filter, log, and modify requests
* Collapsed Forwarding: collapse the same or similar requests and return the same result to clients
* You can also collapse requests for data that is stored close together in the database, so it can be returned in one read
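A minimal sketch of collapsed forwarding, assuming a single-process proxy where concurrent requests for the same key share one in-flight backend call (the <code>fetch</code> callback and all names are illustrative):
<syntaxhighlight lang="cpp">
#include <functional>
#include <future>
#include <mutex>
#include <string>
#include <unordered_map>

// Collapsed forwarding sketch: concurrent requests for the same key share
// one backend call instead of each hitting the origin separately.
class CollapsingProxy {
public:
    explicit CollapsingProxy(std::function<std::string(const std::string&)> fetch)
        : fetch_(std::move(fetch)) {}

    std::string get(const std::string& key) {
        std::shared_future<std::string> result;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto it = inflight_.find(key);
            if (it != inflight_.end()) {
                result = it->second;                               // join the in-flight request
            } else {
                result = std::async(std::launch::async, fetch_, key).share();
                inflight_[key] = result;
            }
        }
        std::string value = result.get();                          // all callers see the same response
        {
            std::lock_guard<std::mutex> lock(mutex_);
            inflight_.erase(key);
        }
        return value;
    }

private:
    std::function<std::string(const std::string&)> fetch_;        // call to the origin server
    std::mutex mutex_;
    std::unordered_map<std::string, std::shared_future<std::string>> inflight_;
};
</syntaxhighlight>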
====Indexes====
* Indexes are used to quickly locate a small piece of data within a large dataset
* Indexes are kept in memory (or somewhere very fast to access), while the data itself may be spread across several servers
* There may be many layers of indexes, each narrowing the search (see the sketch below)
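A minimal sketch of a first-layer, in-memory index that maps a key to where the data actually lives (the <code>Location</code> fields are illustrative):
<syntaxhighlight lang="cpp">
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Where a piece of data lives: which server, and the byte offset in that
// server's data file.
struct Location {
    std::string server;
    std::uint64_t offset;
};

class InMemoryIndex {
public:
    void add(const std::string& key, Location loc) { index_[key] = std::move(loc); }

    std::optional<Location> lookup(const std::string& key) const {
        auto it = index_.find(key);
        if (it == index_.end()) return std::nullopt;
        return it->second;
    }

private:
    std::unordered_map<std::string, Location> index_;  // kept in memory; data stays on disk
};
</syntaxhighlight>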
====Load Balancers====
* Purpose: Handle many connections and route them to request nodes
* Algorithms: Random, round robin, custom criteria
* Software: [https://www.haproxy.org/ HAProxy]
* Usually placed at the front of a distributed system
** Load balancers can send requests to other load balancers
* Challenges: Managing user session data
** Solutions: Caches, cookies, user data, URL rewriting
* Cons: Makes problem diagnosis difficult
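A minimal sketch of the round-robin strategy mentioned above (the node list is illustrative; real load balancers such as HAProxy also track node health):
<syntaxhighlight lang="cpp">
#include <atomic>
#include <cstddef>
#include <string>
#include <vector>

// Round-robin load balancing: each request goes to the next node in turn.
class RoundRobinBalancer {
public:
    explicit RoundRobinBalancer(std::vector<std::string> nodes)
        : nodes_(std::move(nodes)) {}

    // Returns the node to route the next request to. Assumes at least one node.
    const std::string& pick() {
        return nodes_[next_.fetch_add(1) % nodes_.size()];
    }

private:
    std::vector<std::string> nodes_;
    std::atomic<std::size_t> next_{0};
};
</syntaxhighlight>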
====Queues====
* Put client tasks on a queue and acknowledge receipt immediately, then process them asynchronously (see the sketch after this list)
* Queues protect from service outages and failures
* Software: RabbitMQ, ActiveMQ
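A minimal sketch of the queue itself, assuming an in-process producer/consumer setup (systems like RabbitMQ provide this across machines, with durability and acknowledgements):
<syntaxhighlight lang="cpp">
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>

// Task queue sketch: the request path only enqueues (and can acknowledge the
// client immediately); worker threads dequeue and do the slow work later.
class TaskQueue {
public:
    void enqueue(std::string task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        ready_.notify_one();
    }

    std::string dequeue() {  // blocks until a task is available
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !tasks_.empty(); });
        std::string task = std::move(tasks_.front());
        tasks_.pop();
        return task;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> tasks_;
};
</syntaxhighlight>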


==System Design==
==Misc==
* [https://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.html Steve Yegge Get that job at Google]