9.8. Further Reading
1. A Distributed Graph-Theoretic Framework for Automatic Parallelization in Multi-Core Systems
2. SCOP: Scientific Control for Reliable Neural Network Pruning
3. Searching for Low-Bit Weights in Quantized Neural Networks
4. GhostNet: More Features from Cheap Operations
5. AdderNet: Do We Really Need Multiplications in Deep Learning?
6. Blockwise Parallel Decoding for Deep Autoregressive Models
7. Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
8. FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning