lec03
Node Tree
- log_better
- weak_consistency
- GFS
Nodes
| GFS | |
| content | GFS (aka Google File System) |
| children | files_autosplit, high_speeds_parallel, internal_use, record_append, single_data_center, single_master, why_hard, GFS_goals, auto_failure_recovery, big_sequence, big_storage |
| big_storage | |
| content | Big Storage |
| parents | GFS |
| why_hard | |
| content | Why is Big Storage hard? |
| children | faults, performance |
| parents | GFS |
| performance | |
| content | Performance |
| children | sharding (sharding spreads data across many servers, so many clients can read/write in parallel for higher aggregate throughput) |
| parents | why_hard |
| faults | |
| content | Faults |
| children | tolerance |
| parents | why_hard |
| sharding | |
| content | Sharding |
| children | shard |
| parents | performance |
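A minimal Go sketch of why sharding helps performance: chunks of one file land on different servers, so many clients can read in parallel and aggregate throughput scales with the number of servers. The hash-based placement here is only illustrative; GFS's master assigns chunk locations explicitly rather than by hashing.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickServer maps a (file, chunk index) pair to one of n chunk servers.
// Spreading chunks this way lets different clients hit different servers
// at the same time, which is where the performance win comes from.
func pickServer(file string, chunk int, n uint32) uint32 {
	h := fnv.New32a()
	fmt.Fprintf(h, "%s/%d", file, chunk)
	return h.Sum32() % n
}

func main() {
	for c := 0; c < 5; c++ {
		fmt.Printf("chunk %d -> server %d\n", c, pickServer("big.log", c, 3))
	}
}
```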
| tolerance | |
| content | Fault Tolerance |
| children | replication |
| parents | faults |
| replication | |
| content | Replication as a means to add fault tolerance |
| children | almost_identical |
| parents | tolerance |
| almost_identical | |
| content | "Almost Identical" inconsistency risk |
| children | consistency |
| parents | replication |
| consistency | |
| content | Consistency in replications |
| children | strong_consistency, bad_replication |
| parents | almost_identical |
| strong_consistency | |
| content | A strongly consistent system keeps every replica identical, so clients read the same data regardless of which copy serves them |
| children | low_performance_tradeoff, not_strongly_consistent (NOT strongly consistent), system_behaves |
| parents | consistency |
| low_performance_tradeoff | |
| content | A strongly consistent system accepts lower performance as the tradeoff |
| parents | strong_consistency |
| system_behaves | |
| content | A strongly consistent system behaves as if it were a single server |
| parents | strong_consistency |
| bad_replication | |
| content | Bad Replication Design |
| children | events_order |
| parents | consistency |
| events_order | |
| content | no way to ensure events (reads/writes) are processed in the same order on every replica |
| parents | bad_replication |
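A tiny Go demonstration of the bad-replication failure above: two replicas receive the same two writes in different orders and end up with different state, because nothing enforces a single order.

```go
package main

import "fmt"

// apply plays a sequence of writes against a last-writer-wins register
// and returns the final value.
func apply(writes []string) string {
	var reg string
	for _, w := range writes {
		reg = w
	}
	return reg
}

func main() {
	r1 := apply([]string{"x=1", "x=2"}) // replica 1 sees the writes in one order
	r2 := apply([]string{"x=2", "x=1"}) // replica 2 sees them in the other
	fmt.Println("replica1:", r1, "replica2:", r2, "diverged:", r1 != r2)
}
```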
| GFS_goals | |
| content | GFS Goals: Big, Fast, Global |
| parents | GFS |
| high_speeds_parallel | |
| content | High Speeds, Parallel Access |
| parents | GFS |
| files_autosplit | |
| content | Files Automatically Split |
| children | shard (One of the splits of a file is called a "shard") |
| parents | GFS |
| shard | |
| content | Shard |
| children | chunk_server (Shards and chunks may be analogous) |
| parents | files_autosplit, sharding |
| auto_failure_recovery | |
| content | Automatic Failure Recovery |
| parents | GFS |
| single_data_center | |
| content | Single Data Center |
| parents | GFS |
| big_sequence | |
| content | Designed for big sequential reads/writes |
| parents | GFS |
| remarks | As opposed to random reads/writes |
| internal_use | |
| content | Used internally by Google |
| parents | GFS |
| weak_consistency | |
| content | Designed with weak consistency |
| children | nature_of_gfs (weak consistency here shows up as appends that may be missing or duplicated on some replicas), not_strongly_consistent |
| remarks | Using weak consistency was considered heretical by academics |
| single_master | |
| content | Single Master |
| children | chunk_server (Master knows which chunks are stored on which chunk servers), master_data |
| parents | GFS |
| chunk_server | |
| content | Chunk servers store the actual chunks |
| parents | single_master, shard |
| remarks | In GFS terminology the unit is the "chunk"; "shard" is the generic term for a file split, so they refer to the same thing here |
| master_data | |
| content | Master Data |
| children | filename, handle, log_checkpoint |
| parents | single_master |
| filename | |
| content | Filename |
| children | nv |
| parents | master_data |
| nv | |
| content | Non-volatile storage |
| parents | filename |
| handle | |
| content | Handle |
| parents | master_data |
| log_checkpoint | |
| content | Log and Checkpoint |
| children | disk_storage |
| parents | master_data |
| disk_storage | |
| content | Stored to Disk |
| parents | log_checkpoint |
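A sketch of the master's tables in Go, following the lecture's split: the filename-to-chunk-handles map must survive crashes (hence the log/checkpoint on disk), while server lists, primaries, and lease expiries are volatile and can be rebuilt by polling the chunk servers. All field names are hypothetical, not GFS's actual structures.

```go
package main

import (
	"fmt"
	"time"
)

// ChunkInfo is the per-chunk state the master tracks. Only Version must
// be non-volatile; the rest can be reconstructed after a restart.
type ChunkInfo struct {
	Servers  []string  // chunk servers holding a replica (volatile)
	Version  uint64    // bumped when a new primary is designated (non-volatile)
	Primary  string    // current primary, "" if none (volatile)
	LeaseExp time.Time // when the primary's lease runs out (volatile)
}

// Master holds the two core tables: file names to ordered chunk handles
// (non-volatile) and chunk handles to chunk state.
type Master struct {
	Files  map[string][]uint64
	Chunks map[uint64]*ChunkInfo
}

func main() {
	m := &Master{
		Files:  map[string][]uint64{"big.log": {1, 2}},
		Chunks: map[uint64]*ChunkInfo{1: {Servers: []string{"cs1", "cs2", "cs3"}, Version: 7}},
	}
	fmt.Println("files:", len(m.Files), "chunks:", len(m.Chunks))
}
```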
| log_better | |
| content | A log is better than something like a database or b-tree because appending is a sequential disk write, which is more efficient than random in-place updates |
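A sketch of why the log wins: mutations are appended sequentially (cheap on disk), and recovery loads the last checkpoint and replays only the log suffix written after it, instead of replaying all history. Record shapes here are hypothetical.

```go
package main

import "fmt"

// LogRecord is one appended metadata mutation. Appending is a sequential
// disk write; updating a b-tree in place would mean random writes.
type LogRecord struct {
	Op   string // e.g. "create"
	File string
}

// recoverState rebuilds the master's state from the last checkpoint plus
// the records appended after it.
func recoverState(checkpoint map[string]bool, suffix []LogRecord) map[string]bool {
	state := make(map[string]bool, len(checkpoint))
	for f := range checkpoint {
		state[f] = true
	}
	for _, r := range suffix {
		if r.Op == "create" {
			state[r.File] = true
		}
	}
	return state
}

func main() {
	cp := map[string]bool{"a": true}               // state captured at checkpoint time
	tail := []LogRecord{{Op: "create", File: "b"}} // appended since the checkpoint
	fmt.Println(recoverState(cp, tail))            // map[a:true b:true]
}
```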
| record_append | |
| content | How a record is appended in GFS (see the sketch after this subtree) |
| children | client_data_ps, last_chunk |
| parents | GFS |
| last_chunk | |
| content | Where is the last chunk? |
| children | ask_master |
| parents | record_append |
| ask_master | |
| content | Ask the Master server |
| children | no_primary, primary_dead |
| parents | last_chunk |
| no_primary | |
| content | No Primary? |
| children | find_replicate |
| parents | ask_master |
| find_replicate | |
| content | Find an up-to-date replica |
| children | pick_primary |
| parents | no_primary |
| pick_primary | |
| content | Picks Primary |
| children | version_bumped |
| parents | find_replicate |
| version_bumped | |
| content | Version Bumped |
| children | tells_primary_secondary |
| parents | pick_primary |
| tells_primary_secondary | |
| content | Master tells the primary and the secondary replicas their roles (and the new version number) |
| children | lease |
| parents | version_bumped |
| lease | |
| content | Lease granted to the primary: "you are primary for 60s" |
| children | primary_dead (this is what the lease helps with), split_brain_solution |
| parents | tells_primary_secondary |
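A Go sketch of the primary-selection step above: the master only considers replicas whose version number matches the current one (those are up to date), bumps the version for the new epoch, and would then grant the 60s lease. Names are illustrative.

```go
package main

import "fmt"

// pickPrimary returns an up-to-date replica to serve as primary, plus the
// bumped version number. A replica with a stale version missed writes
// and must not become primary.
func pickPrimary(versions map[string]uint64, current uint64) (string, uint64) {
	for server, v := range versions {
		if v == current {
			return server, current + 1 // bump the version for the new primary's epoch
		}
	}
	return "", current // no up-to-date replica reachable: the master must wait
}

func main() {
	vs := map[string]uint64{"cs1": 6, "cs2": 7, "cs3": 7} // cs1 is stale
	p, v := pickPrimary(vs, 7)
	fmt.Printf("primary=%s, new version=%d\n", p, v)
}
```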
| client_data_ps | |
| content | Client sends a copy of the data to the primary and the secondaries |
| children | primary_offset |
| parents | record_append |
| primary_offset | |
| content | Primary Picks Offset |
| children | replicas_write_to_off |
| parents | client_data_ps |
| replicas_write_to_off | |
| content | All replicas told to write the data to that offset |
| children | all_replicas_ok |
| parents | primary_offset |
| all_replicas_ok | |
| content | If all replicas reply "yes", the append succeeded and the primary reports success to the client |
| children | what_if_some_append |
| parents | replicas_write_to_off |
| what_if_some_append | |
| content | What if only some append? |
| children | nature_of_gfs, records_different_order |
| parents | all_replicas_ok |
| nature_of_gfs | |
| content | Some replicas occasionally missing an append is just the nature of GFS; applications are expected to tolerate it |
| parents | what_if_some_append, weak_consistency |
| records_different_order | |
| content | Records in replicas can be in different orders |
| parents | what_if_some_append |
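A minimal Go sketch of the append data path described above: the primary picks the offset (the end of its own copy), every replica writes the record there, and the append only counts as successful if all replicas acknowledge. Otherwise the client retries, which is exactly how duplicates and differently-ordered records creep into replicas.

```go
package main

import "fmt"

// replica models one copy of a chunk; ok simulates whether its write succeeds.
type replica struct {
	data []byte
	ok   bool
}

func (r *replica) writeAt(off int, rec []byte) bool {
	if !r.ok {
		return false
	}
	for len(r.data) < off+len(rec) {
		r.data = append(r.data, 0) // pad up to the chosen offset
	}
	copy(r.data[off:], rec)
	return true
}

// recordAppend: the primary picks the offset, all replicas write there,
// and success requires every acknowledgement.
func recordAppend(primary *replica, secondaries []*replica, rec []byte) (int, bool) {
	off := len(primary.data)
	ok := true
	for _, r := range append([]*replica{primary}, secondaries...) {
		if !r.writeAt(off, rec) {
			ok = false // some replica missed the append; the client will retry
		}
	}
	return off, ok
}

func main() {
	p := &replica{ok: true}
	s := []*replica{{ok: true}, {ok: false}} // one secondary fails its write
	off, ok := recordAppend(p, s, []byte("rec1"))
	fmt.Printf("offset=%d success=%v\n", off, ok)
}
```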
| primary_dead | |
| content | What if Master server thinks the Primary is dead? |
| children | master_doesnt_pick, two_primaries |
| parents | lease, ask_master |
| two_primaries | |
| content | Two primaries in a system is known as "split brain" |
| children | master_doesnt_pick (Otherwise, you end up causing "Split Brain"), network_partition, split_brain_solution |
| parents | primary_dead |
| network_partition | |
| content | Split brain can be caused by a network partition, e.g. one where machines can send but not receive, so the master thinks the primary is dead while clients can still reach it |
| parents | two_primaries |
| split_brain_solution | |
| content | The solution to split brain (two primaries) is a lease on the primary: after the lease expires, that server stops acting as primary, and only then may the master designate a new one |
| parents | lease, two_primaries |
| master_doesnt_pick | |
| content | The Master should NOT immediately designate a new primary when it merely suspects the old one is dead; it must wait for the lease to expire |
| parents | two_primaries, primary_dead |
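A Go sketch of how the lease prevents split brain: the old primary stops acting once its lease expires, and the master refuses to designate a replacement before that instant, so two primaries never overlap in time. The 60s figure is from the lecture; the rest is illustrative.

```go
package main

import (
	"fmt"
	"time"
)

// lease records who is primary and until when.
type lease struct {
	holder string
	exp    time.Time
}

func (l lease) valid(now time.Time) bool { return now.Before(l.exp) }

func main() {
	now := time.Now()
	l := lease{holder: "cs1", exp: now.Add(60 * time.Second)}

	// The master has lost contact with cs1 but must not pick a new
	// primary yet: cs1 may still be alive and serving clients.
	if l.valid(now) {
		fmt.Println("lease live: master waits, cs1 may still be primary")
	}
	if !l.valid(now.Add(61 * time.Second)) {
		fmt.Println("lease expired: cs1 has stopped; safe to pick a new primary")
	}
}
```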
| two_phase_commit | |
| content | Two-Phase Commit. A mechanism for strong consistency |
| parents | not_strongly_consistent, extra_bits |
| not_strongly_consistent | |
| content | GFS is not strongly consistent |
| children | extra_bits, two_phase_commit |
| parents | strong_consistency, weak_consistency |
| extra_bits | |
| content | GFS would need "extra bits" for strong consistency |
| children | two_phase_commit (One of the things you'd add to GFS to make it strongly consistent) |
| parents | not_strongly_consistent |
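For contrast, a minimal Go sketch of two-phase commit, the kind of "extra bits" coordination GFS omits: a write only commits if every replica votes yes in the prepare phase, so partial appends like the ones above cannot happen, at the cost of an extra round trip and blocking on slow replicas.

```go
package main

import "fmt"

// participant votes in the prepare phase; a timeout would count as "no"
// (not modeled here).
type participant struct {
	name      string
	canCommit bool
}

// twoPhaseCommit commits only if every participant votes yes; otherwise
// all of them roll back, so no replica applies the write alone.
func twoPhaseCommit(ps []participant) bool {
	for _, p := range ps { // phase 1: collect votes
		if !p.canCommit {
			return false // phase 2: broadcast abort
		}
	}
	return true // phase 2: broadcast commit
}

func main() {
	ps := []participant{{"cs1", true}, {"cs2", true}, {"cs3", false}}
	fmt.Println("committed:", twoPhaseCommit(ps)) // false: cs3 voted no
}
```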