lec14
Summary
FaRM, OCC
Node Tree
- farm
  - vs_spanner
  - OCC
  - RDMA_nics
  - bottlenecks
  - commit_protocol
  - farm_api
  - forced_occ
  - high_performance
  - network_cpu_bottleneck
  - research_prototype
  - same_datacenter
  - server_memory_layout
  - sharded_primary_backup_pairs
Nodes
| RDMA_nics | |
| content | RDMA nics |
| children | RDMA, clever_network_interface_card, firmware_only, forced_occ (RDMA NICs are the reason for using OCC), sequence_protocol |
| parents | farm |
| vs_spanner | |
| content | vs spanner |
| children | both_2pc, geographic_in_repl, good_performance, ro_trans_sync_time |
| parents | farm |
| both_2pc | |
| content | Both use two-phase commit |
| parents | vs_spanner |
| geographic_in_repl | |
| content | Spanner replicates geographically (across datacenters) |
| parents | vs_spanner |
| ro_trans_sync_time | |
| content | Spanner speeds up read-only transactions using synchronized time |
| parents | vs_spanner |
| bottlenecks | |
| content | Bottlenecks |
| children | speed_of_light, cpu_time |
| parents | farm |
| same_datacenter | |
| content | same datacenter |
| parents | farm |
| research_prototype | |
| content | research prototype |
| parents | farm |
| forced_occ | |
| content | Forced to use OCC (optimistic concurrency control) |
| parents | RDMA_nics, farm |
| good_performance | |
| content | good performance |
| parents | vs_spanner |
| speed_of_light | |
| content | Speed of Light |
| parents | bottlenecks |
| cpu_time | |
| content | CPU time |
| parents | bottlenecks |
| sharded_primary_backup_pairs | |
| content | Sharded on primary backup pairs |
| parents | farm |
| high_performance | |
| content | Ways FaRM gets high performance |
| children | transaction_code, NVRAM, RDMA, data_fits_RAM, kernel_bypass, sharding (main way FaRM gets high performance) |
| parents | farm |
| sharding | |
| content | Sharding |
| parents | high_performance |
| data_fits_RAM | |
| content | Data fits in RAM |
| children | NVRAM |
| parents | high_performance |
| remarks | much faster than disk |
| transaction_code | |
| content | Transaction code |
| parents | high_performance |
| RDMA | |
| content | RDMA |
| children | LAN_only, clever_network_interface_card, one_sided_RDMA, remote_direct_memory_access (Acronym) |
| parents | RDMA_nics, high_performance |
| kernel_bypass | |
| content | Kernel bypass |
| children | skip_stack, DMA_in_app_memory, app_code_acces_nic_without_kernel (description) |
| parents | high_performance |
| clever_network_interface_card | |
| content | Clever network interface card (NIC) |
| parents | RDMA_nics, RDMA |
| NVRAM | |
| content | Non-volatile RAM (NVRAM) |
| children | multiple_servers_write_ram_enough, only_works_for_power_fail |
| parents | data_fits_RAM, high_performance |
| app_code_acces_nic_without_kernel | |
| content | Application code can directly access the network card without going through the kernel |
| parents | kernel_bypass |
| multiple_servers_write_ram_enough | |
| content | Is it enough to simply write to the RAM of multiple servers? |
| children | site_wide_power_failure (No, a sitewide power failure will wipe it all out) |
| parents | NVRAM |
| site_wide_power_failure | |
| content | A site-wide power failure will lose data |
| children | battery_system (preventative measure against power failures) |
| parents | multiple_servers_write_ram_enough |
| battery_system | |
| content | Battery System |
| children | alert_system |
| parents | site_wide_power_failure |
| alert_system | |
| content | Alert System |
| children | server_saves_to_disk (on alert) |
| parents | battery_system |
| server_saves_to_disk | |
| content | Server saves RAM to disk |
| parents | alert_system |
| only_works_for_power_fail | |
| content | Only helps with crashes caused by power failure |
| parents | NVRAM |
| network_cpu_bottleneck | |
| content | Network CPU bottlenecks |
| children | classic_network_stack_too_slow |
| parents | farm |
| classic_network_stack_too_slow | |
| content | Classic Network Stack too slow for RPCs. |
| children | classic_network_stack_top_down |
| parents | network_cpu_bottleneck |
| classic_network_stack_top_down | |
| content | Classic Network stack order: app, buffer, TCP, NIC driver, DMA, NIC |
| children | skip_stack |
| parents | classic_network_stack_too_slow |
| skip_stack | |
| content | Skip stack |
| parents | classic_network_stack_top_down, kernel_bypass |
| DMA_in_app_memory | |
| content | DMA is directly in application memory |
| children | app_takes_tcp_responsibilities |
| parents | kernel_bypass |
| app_takes_tcp_responsibilities | |
| content | Because it skips TCP, application takes on some TCP responsibilities |
| children | sequence_protocol (NIC handles this too) |
| parents | DMA_in_app_memory |
| remote_direct_memory_access | |
| content | Remote Direct Memory Access |
| parents | RDMA |
| firmware_only | |
| content | Handled in NIC firmware only: the host OS doesn't know about the reads/writes |
| parents | RDMA_nics |
| sequence_protocol | |
| content | Run their own reliable sequence protocol, similar to TCP |
| parents | RDMA_nics, app_takes_tcp_responsibilities |
| LAN_only | |
| content | LAN only |
| parents | RDMA |
| one_sided_RDMA | |
| content | One-sided RDMA |
| children | transactions_with_only_one_sided, execute_one_sided_read, one_app_RDMA_another_RDMA (description) |
| parents | RDMA |
| one_app_RDMA_another_RDMA | |
| content | One app uses RDMA to read/write the memory of another app directly |
| children | append_to_log_op (the typical operation for one-sided RDMA in FaRM) |
| parents | one_sided_RDMA |
| append_to_log_op | |
| content | appends to log |
| parents | one_app_RDMA_another_RDMA |
| transactions_with_only_one_sided | |
| content | Can you implement transactions with only one-sided RDMA? |
| children | farm_suggests_no (still a question to think about though) |
| parents | one_sided_RDMA |
| farm_suggests_no | |
| content | FaRM suggests the answer is "no" |
| parents | transactions_with_only_one_sided |
| OCC | |
| content | Optimistic Concurrency Control (OCC) |
| children | version_lockbits_enforce_serializability, buffer_writes_locally, check_later_if_reads_okay, commit_then_validate |
| parents | farm |
| buffer_writes_locally | |
| content | Buffer Writes Locally |
| parents | OCC |
| check_later_if_reads_okay | |
| content | Check later if reads are okay |
| parents | OCC |
| commit_then_validate | |
| content | At commit time, validate the reads before committing |
| children | validation, abort_on_conflicts |
| parents | OCC |
| abort_on_conflicts | |
| content | Abort on conflicts |
| children | exponential_backup |
| parents | commit_then_validate |
| validation | |
| content | Validation |
| children | optimize_for_reads, refetch_object_header |
| parents | commit_then_validate |
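The OCC nodes above (buffer writes locally, check reads later, abort on conflicts) can be sketched with a toy, single-process store. This is only an illustration under stated assumptions, not FaRM's implementation: `Store`, `Transaction`, and `Conflict` are hypothetical names, and there is no networking, locking, or replication here.

```python
class Conflict(Exception):
    """Raised when validation finds that a read object has changed."""


class Store:
    def __init__(self):
        self.data = {}  # oid -> (version, value)


class Transaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # oid -> version seen at first read
        self.writes = {}         # oid -> buffered new value

    def read(self, oid):
        if oid in self.writes:           # read our own buffered write
            return self.writes[oid]
        version, value = self.store.data[oid]
        self.read_versions.setdefault(oid, version)
        return value

    def write(self, oid, value):
        self.writes[oid] = value         # buffered locally, not yet visible

    def commit(self):
        # Validate: every object we read must still have the version we saw.
        for oid, seen in self.read_versions.items():
            if self.store.data[oid][0] != seen:
                raise Conflict(oid)
        # Install buffered writes and bump each version number.
        for oid, value in self.writes.items():
            version = self.store.data.get(oid, (0, None))[0]
            self.store.data[oid] = (version + 1, value)
```

With two transactions that both read and then write the same object, whichever commits second sees a changed version and aborts; the caller is expected to retry it.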
| farm_api | |
| content | API |
| children | txcommit, txcreate, txread, txwrite, OID |
| parents | farm |
| txcreate | |
| content | txCreate() |
| children | creates_transaction |
| parents | farm_api |
| txread | |
| content | txRead() |
| children | OID (input argument) |
| parents | farm_api |
| OID | |
| content | Object ID (OID) |
| children | compound_identifier |
| parents | txread, txwrite, farm_api |
| creates_transaction | |
| content | Creates Transaction |
| parents | txcreate |
| txwrite | |
| content | txWrite() |
| children | OID (input argument) |
| parents | farm_api |
| exponential_backup | |
| content | Exponential backoff may be used? |
| parents | abort_on_conflicts |
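If aborted transactions are retried with exponential backoff (the note above leaves this as an open question), a retry loop might look roughly like the following. The helper name `run_with_backoff`, the `Conflict` exception, and the delay parameters are all assumptions made for illustration.

```python
import random
import time


class Conflict(Exception):
    """Hypothetical signal that a transaction attempt aborted on a conflict."""


def run_with_backoff(attempt, max_tries=5, base_delay=0.001, sleep=time.sleep):
    """Call attempt() until it succeeds, sleeping longer after each abort."""
    for n in range(max_tries):
        try:
            return attempt()
        except Conflict:
            if n == max_tries - 1:
                raise  # give up after the last allowed try
            # Randomized delay whose upper bound doubles on every retry.
            sleep(random.uniform(0, base_delay * (2 ** n)))
```

The random jitter is a common choice so that two transactions that conflicted once are less likely to collide again on the retry.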
| compound_identifier | |
| content | Compound Identifier |
| children | address, region_num |
| parents | OID |
| region_num | |
| content | Region Number |
| parents | compound_identifier |
| address | |
| content | Address |
| parents | compound_identifier |
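A minimal sketch of the compound identifier described above: an OID made of a region number plus an address within that region. The 32-bit field widths and the `pack`/`unpack` helpers are assumptions for illustration; the notes do not specify a layout.

```python
from dataclasses import dataclass

OFFSET_BITS = 32  # assumed width of the address field, for illustration only


@dataclass(frozen=True)
class OID:
    region: int   # which region holds the object
    address: int  # offset of the object inside that region

    def pack(self) -> int:
        """Combine both parts into a single integer."""
        return (self.region << OFFSET_BITS) | self.address

    @staticmethod
    def unpack(word: int) -> "OID":
        """Split a packed integer back into region number and address."""
        return OID(word >> OFFSET_BITS, word & ((1 << OFFSET_BITS) - 1))
```

The region number tells a client which primary/backup pair to contact, and the address locates the object in that server's memory.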
| server_memory_layout | |
| content | Server Memory Layout |
| children | logs_for_each_server, pair_msg_queues, region |
| parents | farm |
| region | |
| content | Region |
| children | versioned_objects |
| parents | server_memory_layout |
| versioned_objects | |
| content | Versioned Objects |
| children | version_num, lock_flag |
| parents | region |
| version_num | |
| content | version number |
| parents | versioned_objects |
| lock_flag | |
| content | Lock flag |
| parents | versioned_objects |
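One way to picture a versioned object's header is a single 64-bit word holding both the lock flag and the version number, so that both can be read or swapped together. The exact layout below (lock flag in the top bit, version in the low 63 bits) is an assumption for this sketch.

```python
LOCK_BIT = 1 << 63            # assumed: top bit is the lock flag
VERSION_MASK = LOCK_BIT - 1   # assumed: low 63 bits hold the version number


def make_header(version: int, locked: bool = False) -> int:
    """Build a header word from a version number and a lock flag."""
    return (version & VERSION_MASK) | (LOCK_BIT if locked else 0)


def is_locked(header: int) -> bool:
    return bool(header & LOCK_BIT)


def version_of(header: int) -> int:
    return header & VERSION_MASK
```

Keeping both fields in one word is what makes a single compare-and-swap sufficient to check the version and take the lock at the same time.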
| pair_msg_queues | |
| content | Pair of Message Queues |
| parents | server_memory_layout |
| logs_for_each_server | |
| content | Logs, one for each of the other servers |
| parents | server_memory_layout |
| commit_protocol | |
| content | Commit Protocol |
| children | execute_phase |
| parents | farm |
| execute_phase | |
| content | Execute Phase |
| children | txcommit_call, reads_everything_needed |
| parents | commit_protocol |
| reads_everything_needed | |
| content | Reads everything it needs |
| parents | execute_phase |
| txcommit_call | |
| content | txCommit() call |
| children | commit_phase (happens when all yes) |
| parents | txcommit, execute_phase |
| txcommit | |
| content | txCommit() |
| children | txcommit_call |
| parents | farm_api |
| commit_phase | |
| content | commit phase |
| children | trans_coord_all_yes, lock_phase |
| parents | txcommit_call |
| lock_phase | |
| content | Lock Phase |
| children | trans_coord_all_yes, send_object_id |
| parents | commit_phase |
| send_object_id | |
| content | Client sends each primary server the identity of each updated object |
| children | append_to_log |
| parents | lock_phase |
| trans_coord_all_yes | |
| content | Transaction coordinator notifies primary servers "all yes" |
| children | append_to_prim |
| parents | commit_phase, lock_phase |
| append_to_log | |
| content | Append to log |
| children | prim_active_log_process |
| parents | send_object_id |
| prim_active_log_process | |
| content | Primaries actively process new logs, and send yes/no vote |
| children | version_changed, is_object_already_locked |
| parents | append_to_log |
| is_object_already_locked | |
| content | is object already locked? |
| parents | prim_active_log_process |
| version_changed | |
| content | has the version number changed? |
| children | atomic_compare_and_swap |
| parents | prim_active_log_process |
| atomic_compare_and_swap | |
| content | Atomic compare_and_swap |
| children | multithread_race_transactions (rationale for atomic operation) |
| parents | version_changed |
| multithread_race_transactions | |
| content | Multithreading can cause races between transactions |
| parents | atomic_compare_and_swap |
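The lock-phase checks above (is the object already locked? has the version changed?) can be folded into one atomic compare-and-swap on the header word. The sketch below emulates that atomicity with a Python mutex; the class `ObjectHeader`, the function `try_lock`, and the top-bit lock layout are assumptions, not FaRM's actual code.

```python
import threading

LOCK_BIT = 1 << 63  # assumed layout: top bit is the lock flag


class ObjectHeader:
    def __init__(self, version: int):
        self._word = version
        self._guard = threading.Lock()  # stands in for hardware atomicity

    def compare_and_swap(self, expected: int, new: int) -> bool:
        """Atomically replace the word with `new` only if it equals `expected`."""
        with self._guard:
            if self._word != expected:
                return False
            self._word = new
            return True


def try_lock(header: ObjectHeader, version_read: int) -> bool:
    """Vote 'yes' only if the object is unlocked and still at version_read."""
    # The expected word has the lock bit clear and the version the
    # transaction saw; any other state makes the swap fail.
    return header.compare_and_swap(version_read, version_read | LOCK_BIT)
```

Because the check and the update happen in one atomic step, two threads processing lock records for the same object cannot both succeed.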
| append_to_prim | |
| content | append to primary log |
| children | commit_prim_record |
| parents | trans_coord_all_yes |
| commit_prim_record | |
| content | commit primary record |
| children | update_object_version_clear_lock_bit |
| parents | append_to_prim |
| update_object_version_clear_lock_bit | |
| content | Update object and version number, clear lock bit |
| parents | commit_prim_record |
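The commit-primary step above can be sketched as a small function that installs the new value, advances the version number, and clears the lock bit. `VersionedObject` and `apply_commit_primary` are hypothetical names, and incrementing the version by exactly one is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class VersionedObject:
    value: Any
    version: int
    locked: bool = False


def apply_commit_primary(obj: VersionedObject, new_value: Any) -> None:
    """Apply a commit-primary record to an object locked in the lock phase."""
    if not obj.locked:
        raise RuntimeError("commit record for an object that was never locked")
    obj.value = new_value   # update the object
    obj.version += 1        # new version makes concurrent readers fail validation
    obj.locked = False      # clear the lock bit so others can proceed
```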
| version_lockbits_enforce_serializability | |
| content | Version numbering and lock bits enforce serializability in OCC |
| parents | OCC |
| optimize_for_reads | |
| content | Optimization for objects that a transaction only reads and does not write |
| children | straight_ro_transaction |
| parents | validation |
| refetch_object_header | |
| content | Refetch object header |
| children | check_versions_locks |
| parents | validation |
| check_versions_locks | |
| content | Checks for version changes since start and if the lock bit is set |
| parents | refetch_object_header |
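The validation check above can be sketched as follows: for each object the transaction read but did not write, refetch the header and confirm that the version is unchanged and the lock bit is clear. `Header`, `validate`, and the `fetch_header` callable (standing in for a one-sided read of the header) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Header:
    version: int
    locked: bool


def validate(read_set: Dict[str, int],
             fetch_header: Callable[[str], Header]) -> bool:
    """Return True only if every read object is unlocked and unchanged."""
    for oid, version_seen in read_set.items():
        header = fetch_header(oid)  # refetch the current object header
        if header.locked or header.version != version_seen:
            return False            # conflict: the transaction must abort
    return True
```

For example, with `read_set = {"a": 3}`, validation passes while the header of `"a"` is still at version 3 and unlocked, and fails as soon as another transaction has locked it or bumped its version.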
| straight_ro_transaction | |
| content | Straight read-only transaction |
| children | execute_one_sided_read, ro_valid_optimizer |
| parents | optimize_for_reads |
| execute_one_sided_read | |
| content | Execute with fast one-sided read |
| parents | straight_ro_transaction, one_sided_RDMA |
| ro_valid_optimizer | |
| content | read-only validation optimizer |
| parents | straight_ro_transaction |