lec04
Node Tree
- fault_tolerance
- logging_channel
- depends
Nodes
| fault_tolerance | |
| content | Fault Tolerance |
| children | replication (Tool Used For Fault Tolerance), vmware_ft |
| replication | |
| content | Replication |
| children | limits_to, replication_schemes, worth_it, expected_failures |
| parents | fault_tolerance |
| expected_failures | |
| content | Expected Failures To Address |
| children | fail_stop_faults |
| parents | replication |
| fail_stop_faults | |
| content | Fail Stop Faults: Stops Computing Entirely |
| children | hardware_errors |
| parents | expected_failures |
| limits_to | |
| content | Limits To Replication (Not Covered) |
| children | software_bugs, correlated_failures |
| parents | replication |
| software_bugs | |
| content | Bugs in Software |
| parents | limits_to |
| vmware_ft | |
| content | VMWare FT. This lecture studies this particular replication design. |
| children | full_state_detailed (This is the approach that VMWare FT uses, which makes it unique.), output, primary_fails, timer_exact, unicore_processor, vmm |
| parents | fault_tolerance |
| hardware_errors | |
| content | Hardware errors can sometimes be turned into fail-stop faults. The advantage is that such errors then become detectable. |
| parents | fail_stop_faults |
| correlated_failures | |
| content | Correlated failures include hardware defects (such as a defective batch of servers from a single company) and natural disasters like earthquakes. |
| children | physical_separation |
| parents | limits_to |
| worth_it | |
| content | Is replication worth it? |
| parents | replication |
| depends | |
| content | Depends on value of a reliable service |
| physical_separation | |
| content | Physical separation (different countries) |
| parents | correlated_failures |
| state_transfer | |
| content | State Transfer |
| children | smaller_operations (more favorable than state transfer), whole_state |
| parents | replication_schemes |
| replication_schemes | |
| content | Replication Schemes |
| children | replicated_state_machine, state_transfer |
| parents | replication |
| replicated_state_machine | |
| content | Replicated State Machine |
| children | internal_deterministic, smaller_operations (This is a "pro" for using RSMs over state transfer), designing_rsm |
| parents | replication_schemes |
| whole_state | |
| content | Sends whole state of primary |
| children | just_send_external (Sending external events typically means sending less) |
| parents | state_transfer |
| internal_deterministic | |
| content | Works on the assumption that most internal operations of a CPU are deterministic |
| children | unicore_processor (single-core instructions are deterministic) |
| parents | replicated_state_machine |
| just_send_external | |
| content | Just send external events (input events, packets, etc) |
| children | nondeterministic_events (External events are the non-deterministic events) |
| parents | whole_state |
| smaller_operations | |
| content | RSMs tend to send smaller operations than state transfer does, which tends to make them more favorable |
| children | ops_more_complex (Potential downside of RSMs) |
| parents | state_transfer, replicated_state_machine |
| ops_more_complex | |
| content | Operations in RSMs tend to be more complex |
| parents | smaller_operations |
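
The contrast between state transfer and a replicated state machine can be illustrated with a toy key-value store. This is only a sketch: `apply_op`, the `(key, value)` operation format, and the store itself are hypothetical and are not taken from the lecture.

```python
import copy


def apply_op(state: dict, op: tuple) -> None:
    """Apply one deterministic operation (hypothetical format: (key, value))."""
    key, value = op
    state[key] = value


# The primary processes a stream of external events.
primary = {}
ops = [("x", 1), ("y", 2), ("x", 3)]
for op in ops:
    apply_op(primary, op)

# State transfer: ship the primary's whole state to the backup.
backup_by_transfer = copy.deepcopy(primary)

# Replicated state machine: ship only the external events and replay them.
backup_by_replay = {}
for op in ops:
    apply_op(backup_by_replay, op)
```

Both backups end up equal to the primary, but the replayed one only needed the operations, which is why replay depends on every operation being deterministic.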
| unicore_processor | |
| content | VMWare FT replication works on unicore processors |
| children | multicore_nondeterministic (multicore unable to be used with this replication scheme) |
| parents | internal_deterministic, vmware_ft |
| multicore_nondeterministic | |
| content | multicore processors can't be used because the way instructions are interleaved makes them non-deterministic |
| children | multicore_parallelism |
| parents | unicore_processor |
| flashcard (front) | Why can't multicore processors be used in the VMWare FT Replication scheme? |
| flashcard (back) | The way multicore processors interleave instructions makes them non-deterministic and therefore unsuitable for the VMware FT replication scheme. |
| level_of_replication | |
| content | What level of replication should be used? |
| children | full_state_detailed |
| parents | designing_rsm |
| designing_rsm | |
| content | Designing a Replicated State Machine (RSM) |
| children | how_close_is_sync, level_of_replication, new_replica_expensive |
| parents | replicated_state_machine |
| flashcard (front) | What does RSM stand for? |
| flashcard (back) | Replicated State Machine. |
| how_close_is_sync | |
| content | How close is synchronization? (between primary/backup) |
| children | sync_ideal |
| parents | designing_rsm |
| sync_ideal | |
| content | Ideal Synchronization: if primary fails, switch over to backup with no anomalies. |
| parents | how_close_is_sync |
| remarks | this never actually happens in practice, anomalies do occur |
| new_replica_expensive | |
| content | Creation of a new replica is expensive |
| children | full_state_detailed |
| parents | designing_rsm |
| full_state_detailed | |
| content | Copying full State of machine (registers, memory) is very detailed |
| children | application_level (more efficient than machine-level replication) |
| parents | level_of_replication, new_replica_expensive, vmware_ft |
| application_level | |
| content | Most replication schemes are application-level |
| children | replication_application |
| parents | full_state_detailed |
| replication_application | |
| content | Replication needs to be a part of the application in order to work. |
| children | existing_software (Existing software runs on top of machine and can work without modification or any knowledge of replication.) |
| parents | application_level |
| existing_software | |
| content | Existing software will work as-is using machine-level replication. |
| parents | replication_application |
| multicore_parallelism | |
| content | Multicore Parallelism is not covered |
| parents | multicore_nondeterministic, nondeterministic_events |
| nondeterministic_events | |
| content | Examples of non-deterministic events |
| children | inputs, multicore_parallelism |
| parents | just_send_external |
| inputs | |
| content | Inputs are the most common non-deterministic event |
| children | network_packets |
| parents | nondeterministic_events |
| network_packets | |
| content | Inputs in this scope are just network packets |
| children | data_interrupt |
| parents | inputs |
| data_interrupt | |
| content | When a packet arrives, the data in the packet and the interrupt type are stored. |
| children | timing_interrupt |
| parents | network_packets |
| timing_interrupt | |
| content | The timing of the interrupt (the point in the instruction stream at which it is delivered) must be identical on the primary and the backup. |
| parents | data_interrupt |
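
The requirement that an interrupt land at the same point in the instruction stream on both machines can be sketched as a replay loop. Everything here is an assumption made for illustration: the `LogEntry` fields, the `run` function, and the toy "guest" that just counts instructions do not reflect VMware FT's actual log format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    instr_count: int   # instruction number at which the interrupt is delivered
    kind: str          # interrupt type (hypothetical label)
    data: bytes        # packet payload


def run(total_instrs: int, log: list) -> list:
    """Toy guest: steps through numbered instructions and records each
    interrupt together with the instruction number at which it was seen."""
    pending = sorted(log, key=lambda e: e.instr_count)
    seen = []
    i = 0
    for instr in range(total_instrs):
        # Deliver every interrupt logged for exactly this instruction.
        while i < len(pending) and pending[i].instr_count == instr:
            seen.append((instr, pending[i].data))
            i += 1
    return seen


log = [LogEntry(3, "net_rx", b"hello"), LogEntry(7, "net_rx", b"world")]
primary_view = run(10, log)
backup_view = run(10, log)   # the backup replays the same log
```

Because delivery is keyed on the instruction count rather than on wall-clock time, both runs observe the packets at identical points.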
| vmm | |
| content | Virtual Machine Monitor |
| children | packet_sends_vm_backup |
| parents | vmware_ft |
| packet_sends_vm_backup | |
| content | The VMM receives a network packet, delivers it to the VM, then sends a copy of the packet to the backup |
| children | primary_outputs_only |
| parents | vmm |
| primary_outputs_only | |
| content | Both primary and backup see inputs, primary is the only one that outputs. |
| parents | packet_sends_vm_backup |
| logging_channel | |
| content | Logging Channel: stream of events. |
| children | log_entry_format, only_weird_instructions, arriving_packets |
| remarks | Context: sending "Log events on the log channel" |
| primary_fails | |
| content | What if the primary fails? |
| children | backup_stops_logs |
| parents | vmware_ft |
| backup_stops_logs | |
| content | The indicator that the primary has failed is that the backup stops receiving logs from the primary. |
| children | backup_goes_live |
| parents | primary_fails |
| remarks | Apparently logs get sent quite frequently to the backup (many times a second). Some kind of "heartbeat" or timing interrupt? I forget the exact terminology |
| backup_goes_live | |
| content | The Backup Goes "Live" |
| children | vm_allows_backup_to_run |
| parents | backup_stops_logs |
| vm_allows_backup_to_run | |
| content | The VMM allows the backup to run freely. The backup then stops discarding output. |
| parents | backup_goes_live |
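
The "go live" sequence above can be sketched as a timeout on log-channel silence. This is a simplified illustration: `BackupMonitor`, its method names, and the one-second timeout are hypothetical, and the sketch leaves out any arbitration between primary and backup.

```python
class BackupMonitor:
    """Hypothetical backup-side failure detector driven by log-channel silence."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.last_log_time = 0.0
        self.live = False

    def on_log(self, now: float) -> None:
        # Any log entry from the primary counts as evidence that it is alive.
        self.last_log_time = now

    def tick(self, now: float) -> None:
        # Prolonged silence means the primary is presumed dead: go live.
        if not self.live and now - self.last_log_time > self.timeout:
            self.live = True

    def emit(self, packet, wire: list) -> None:
        # Until it goes live, the backup discards its own output.
        if self.live:
            wire.append(packet)


wire = []
m = BackupMonitor(timeout=1.0)
m.on_log(now=0.5)
m.tick(now=1.0)
m.emit("dropped", wire)   # still a backup: output is discarded
m.tick(now=2.0)           # 1.5 s of silence exceeds the timeout
m.emit("sent", wire)      # now live: output reaches the network
```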
| only_weird_instructions | |
| content | Only "weird" instructions get sent to the log channel |
| parents | logging_channel |
| log_entry_format | |
| content | Format of a log entry |
| children | interrupt_type, log_entry_data |
| parents | logging_channel |
| remarks | They don't explicitly say what the format of a log entry is in the paper. |
| interrupt_type | |
| content | Interrupt Type |
| parents | log_entry_format |
| remarks | I just wrote "type", but I'm assuming it's interrupt type |
| log_entry_data | |
| content | Data (from network packet) |
| parents | log_entry_format |
| timer_exact | |
| content | Assumes VM has timer in exactly the same place for both the Primary and Backup |
| children | physical_timer_to_guest, backup_gets_ahead |
| parents | vmware_ft |
| physical_timer_to_guest | |
| content | Physical timer interrupts are sent to guest |
| parents | timer_exact |
| arriving_packets | |
| content | Arriving Packets |
| children | NICS_DMA |
| parents | logging_channel |
| NICS_DMA | |
| content | Some NICs use DMA (direct memory access) in their implementation. |
| children | primary_no_DMA |
| parents | arriving_packets |
| primary_no_DMA | |
| content | The primary cannot access the NIC or its DMA directly |
| children | private_mem |
| parents | NICS_DMA |
| private_mem | |
| content | Packets from the NIC are DMA'd into the VMM's private memory, then copied over to the primary |
| children | bounce_buffer ("Bounce Buffer" is the term for what this does) |
| parents | primary_no_DMA |
| bounce_buffer | |
| content | Bounce Buffer |
| parents | private_mem |
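
A bounce buffer can be sketched as a two-step copy: the packet first lands in memory the guest cannot see, then is copied into guest memory in one controlled step. The function name, the `(offset, bytes)` log tuple, and the use of a `bytearray` as guest memory are all assumptions made for illustration.

```python
def deliver_via_bounce_buffer(nic_packet: bytes, guest_memory: bytearray,
                              offset: int, log: list) -> None:
    """Hypothetical delivery path for one incoming packet."""
    # Step 1: the NIC's DMA lands in private memory, invisible to the guest.
    bounce = bytearray(nic_packet)
    # Step 2: copy into guest memory at a moment the monitor chooses.
    guest_memory[offset:offset + len(bounce)] = bounce
    # Step 3: record the same bytes so they can be forwarded to the backup.
    log.append((offset, bytes(bounce)))


guest = bytearray(8)
log = []
deliver_via_bounce_buffer(b"abcd", guest, 2, log)
```

The point of the extra copy is that the guest never observes a partially written packet, so the moment the data becomes visible can be made the same on both machines.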
| backup_gets_ahead | |
| content | What if the backup gets ahead of the primary's execution? This must never be allowed to happen. |
| children | event_buffer_nonempty (Event buffer is used to prevent backup from getting ahead) |
| parents | timer_exact |
| event_buffer_nonempty | |
| content | Event buffer: the backup VM only executes instructions while the buffer is non-empty |
| parents | backup_gets_ahead |
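
The event-buffer rule can be sketched as a gate on the backup's execution. This is a minimal illustration, with `BackupVMM`, `receive`, and `step` as hypothetical names and string placeholders standing in for real log events.

```python
from collections import deque


class BackupVMM:
    """Hypothetical backup-side monitor: the guest runs only while events are buffered."""

    def __init__(self):
        self.buffer = deque()   # log events received from the primary
        self.applied = []       # events the backup has executed up to

    def receive(self, event) -> None:
        self.buffer.append(event)

    def step(self) -> bool:
        """Execute up to the next buffered event; stall if the buffer is empty."""
        if not self.buffer:
            return False        # nothing buffered: the backup must wait
        self.applied.append(self.buffer.popleft())
        return True


backup = BackupVMM()
stalled = not backup.step()     # no events yet, so the backup cannot run
backup.receive("pkt-1")
ran = backup.step()             # one event buffered, so the backup advances
```

Since the backup can only advance by consuming an event the primary has already produced, it can never run past the primary.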
| output | |
| content | Handling output events |
| children | network_packets_only, awkward_failures |
| parents | vmware_ft |
| network_packets_only | |
| content | In this context, the only thing being output are network packets |
| parents | output |
| awkward_failures | |
| content | What are the kinds of awkward failures that could happen? |
| children | network_split_brain (example of failure), output_rules, test_and_set (Preventative Solution) |
| parents | output |
| output_rules | |
| content | Output Rules: preventative measures against certain kinds of failures |
| children | output_waits_for_backup (This prevents issues related to backup not receiving network packets over log channel) |
| parents | awkward_failures |
| output_waits_for_backup | |
| content | The primary can't produce any output until the backup has received all previous events up to that point in time. |
| parents | output_rules |
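
The output rule above can be sketched as a queue of held packets that is released only when the backup has caught up. The `Primary` class, its counters, and the explicit `ack` call are assumptions made for illustration; the notes only say that the backup must have received the earlier events.

```python
class Primary:
    """Hypothetical primary: holds output until the backup has the log so far."""

    def __init__(self):
        self.log_sent = 0    # number of log events sent to the backup
        self.log_acked = 0   # number of log events the backup confirmed (assumed ack)
        self.held = []       # (required_ack, packet) pairs waiting for release
        self.wire = []       # packets actually emitted to clients

    def log_event(self) -> None:
        self.log_sent += 1

    def output(self, packet) -> None:
        # This output depends on every event logged so far.
        self.held.append((self.log_sent, packet))
        self._release()

    def ack(self, upto: int) -> None:
        self.log_acked = max(self.log_acked, upto)
        self._release()

    def _release(self) -> None:
        # Emit held packets, in order, once their prerequisite events are confirmed.
        while self.held and self.held[0][0] <= self.log_acked:
            self.wire.append(self.held.pop(0)[1])


p = Primary()
p.log_event()                     # e.g. an incoming client request
p.output("reply")                 # held: the backup has not confirmed event 1
held_before_ack = list(p.wire)    # snapshot of what reached the wire so far
p.ack(1)                          # the backup confirms event 1; "reply" is released
```

Note that only the output is delayed in this sketch; the primary itself is free to keep executing while packets sit in `held`.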
| test_and_set | |
| content | Test And Set: an outside authority that decides which machine (primary/backup) can be "live" |
| children | acts_like_lock, network_split_brain ("Test and Set" server used to solve this) |
| parents | awkward_failures |
| network_split_brain | |
| content | Network Issues can cause split brain |
| parents | awkward_failures, test_and_set |
| acts_like_lock | |
| content | Test/Set server acts like a lock. The primary/backup send requests to this server to get write permission, which in turn sets a flag on the Test/Set server. |
| parents | test_and_set |
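
The lock-like behavior of the Test/Set server can be sketched with a single flag guarded by a mutex. `ArbiterServer` and `test_and_set` are hypothetical names, and here the method returns whether the caller won rather than the flag's old value, which is a simplification chosen for readability.

```python
import threading


class ArbiterServer:
    """Hypothetical Test/Set server: the first caller to set the flag may go live."""

    def __init__(self):
        self._flag = False
        self._lock = threading.Lock()

    def test_and_set(self) -> bool:
        """Atomically set the flag; return True only to the caller that found it clear."""
        with self._lock:
            if self._flag:
                return False
            self._flag = True
            return True


server = ArbiterServer()
primary_may_go_live = server.test_and_set()   # first request wins
backup_may_go_live = server.test_and_set()    # second request is refused
```

If a network partition leaves both machines believing the other is dead, only one of them gets `True`, which is how the split-brain case is avoided.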