
IEEE 2941-2021

$93.71

IEEE Standard for Artificial Intelligence (AI) Model Representation, Compression, Distribution, and Management

Published by: IEEE
Publication date: 2021
Number of pages: 226

New IEEE Standard – Active. This standard specifies the AI development interface, the AI model interoperable representation, the coding format, and the model encapsulated format for efficient AI model inference, storage, distribution, and management.
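To give a flavor of the compression operators the standard covers (e.g. linear quantization in 7.2.2.1 and linear dequantization in 8.2.2.1), here is a minimal, non-normative sketch of symmetric linear quantization. The scale computation and clipping range below are generic assumptions for illustration, not the operator definitions from the standard itself.

```python
import numpy as np

def linear_quantize(w, num_bits=8):
    """Uniformly quantize a float tensor to signed integers.

    Generic symmetric linear-quantization sketch (assumed form,
    not the normative operator of IEEE 2941-2021, 7.2.2.1).
    """
    qmax = 2 ** (num_bits - 1) - 1                 # e.g. 127 for INT8
    scale = max(float(np.max(np.abs(w))) / qmax, 1e-12)  # avoid div-by-zero
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def linear_dequantize(q, scale):
    """Reconstruct an approximation of the original weights (cf. 8.2.2.1)."""
    return q.astype(np.float32) * scale

w = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
q, s = linear_quantize(w)
w_hat = linear_dequantize(q, s)   # close to w, within one quantization step
```

The round trip loses at most half a quantization step per weight, which is the trade-off the standard's quantization/dequantization operator pairs (Clauses 7 and 8) manage in a bitstream-interoperable way.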

PDF Catalog

PDF Pages PDF Title
1 IEEE Std 2941-2021 Front Cover
2 Title page
4 Important Notices and Disclaimers Concerning IEEE Standards Documents
Notice and Disclaimer of Liability Concerning the Use of IEEE Standards Documents
5 Translations
Official statements
Comments on standards
Laws and regulations
Data privacy
Copyrights
6 Photocopies
Updating of IEEE Standards documents
Errata
Patents
IMPORTANT NOTICE
7 Participants
Participants
8 Introduction
9 Contents
11 1. Overview
1.1 Scope
1.2 Purpose
1.3 Word usage
12 2. Normative references
3. Definitions, acronyms, and abbreviations
3.1 Definitions
14 3.2 Acronyms and abbreviations
4. Symbols and operators
4.1 Arithmetic operators
15 4.2 Logical operator
4.3 Relational operators
16 4.4 Bitwise operators
4.5 Assignment operators
5. Framework of convolutional neural network representation and model compression
17 6. Syntax and semantics of neural network models
6.1 Data structure
6.1.1 Data structure of neural network structure
18 6.1.2 Data structure of neural network parameters
19 6.2 Syntax description
6.2.1 Overview
20 6.2.2 Definition of model structure
6.2.3 Definition of contributor list
6.2.4 Definition of computational graph
21 6.2.5 Definition of operator node
6.2.6 Definition of variable node
6.2.7 Definition of attribute
22 6.2.8 Definition of other type
6.2.9 Definition of tensor type
23 6.2.10 Definition of tensor
6.2.11 Definition of tensor size
24 6.2.12 Definition of dimension
6.3 Semantic description
71 6.4 Definition of training operator
6.4.1 Loss function
73 6.4.2 Definition of inverse operator
75 7. Compression process
7.1 Multiple models
7.1.1 Definition of multiple models technology
76 7.1.2 Compression of multiple models
77 7.1.3 Shared compression operator for weights of multiple model layers
7.1.3.1 Definition
7.1.3.2 Weight aggregation
79 7.1.4 Residual quantization compression
7.1.4.1 Definition of residual quantization for multiple models
7.1.4.2 Weight sharing
81 7.2 Quantization
7.2.1 Definition
82 7.2.2 Basic quantization operator
83 7.2.2.1 Linear quantization
84 7.2.2.2 Codebook quantization
86 7.2.3 Parameter quantization operator
7.2.3.1 Nonlinear function mapping
88 7.2.3.2 INT4 parameter quantization
90 7.2.3.3 Parameter quantization for bounded ReLU
91 7.2.4 Activate quantization operator
7.2.4.1 Trainable alpha quantization
7.2.4.2 INT4 activation quantization
93 7.2.4.3 Activation quantization for bounded ReLU
95 7.2.4.4 Ratio synchronization quantization
97 7.3 Pruning
7.3.1 Overview
98 7.3.2 Pruning operator
100 7.4 Structured matrix
7.4.1 Structured matrix compression
101 7.4.2 Method for the compression of block circulant matrix with signed vectors
7.4.2.1 Block circulant matrix compression operator
102 7.4.2.2 Random vector dimension list and random vector generation operator
104 7.4.3 Method for the low-rank sparse decomposed structured matrix
7.4.3.1 Definition
7.4.3.2 Decomposition compression operator for the convolutional layers in low-rank sparse decomposed structured matrix
106 7.4.3.3 Compression operator of a fully connected or 1 × 1 convolutional layer in a low-rank sparse decomposed structured matrix
107 8. Decompression process
8.1 Multiple models
8.1.1 Decompression for multiple models
108 8.1.2 Decompression operator for weights of multiple model layers
8.1.2.1 Decompression for weights of multiple model layers
8.1.2.2 Decompression output multiple models
110 8.1.2.3 Decompression output specific model
111 8.1.2.4 Decompression output switched specific models
112 8.1.3 Decompression of residual quantization for multiple models
8.1.3.1 Definition of decompression
8.1.3.2 Decompression of the output target model
113 8.2 Dequantization
8.2.1 Definition
8.2.2 Basic dequantization operator
115 8.2.2.1 Linear dequantization
116 8.2.2.2 Codebook dequantization
117 8.2.3 Parameter dequantization operator
8.2.3.1 Nonlinear function mapping dequantization
118 8.2.3.2 INT4 parameter dequantization
119 8.2.4 Activate dequantization operator
8.2.4.1 Trainable alpha value dequantization
120 8.2.4.2 INT4 activation dequantization
121 8.3 Inverse sparsity/inverse pruning operator
8.3.1 Definition
122 8.3.2 Inverse sparsity
123 8.4 Structured matrix
8.4.1 Decompression of structured matrix
124 8.4.2 Method for the decompression of block circulant matrix with signed vectors
8.4.2.1 Block circulant matrix decompression operator
126 8.4.2.2 Disturbance vector generation operator
128 8.4.2.3 Operator on the layers using signed vector and block circulant matrix techniques
129 8.4.3 Methods for the decompression of low-rank sparse decomposed structured matrix
8.4.3.1 Overview
8.4.3.2 Decompression operator for low-rank sparse decomposed structured matrix
130 8.4.3.3 Decompression operator for the fully connected and 1 × 1 layers in low-rank sparse decomposed structured matrix
131 9. Data generation
9.1 Definition
9.2 Training data generation method
9.2.1 Method of generating training data based on real data
9.2.1.1 Overview
9.2.1.2 Data augmentation method
133 9.2.1.3 Generating data using the GAN
135 9.2.2 Data-free training data generation method
9.2.2.1 Overview
9.2.2.2 Generating training data using the GAN
138 9.3 Multiple models
9.3.1 Method for weight generation in multiple models
9.3.1.1 Multiple models weight update operator
139 9.3.1.2 Multiple models weight shared data generation approach 1
141 9.3.1.3 Multiple models weight shared data generation approach 2
143 9.3.2 Residual quantization training method for multiple models
144 9.4 Quantization
9.4.1 Parameter quantization
9.4.1.1 Data generation for INT4 parameter quantization
150 9.4.1.2 Interval shrinkage quantization data generation
154 9.4.2 Activate quantization
9.4.2.1 Data generation for INT4 activate quantization
158 9.4.2.2 Trainable alpha quantization training data generation
159 9.5 Pruning
9.5.1 Overview
160 9.5.2 Sparse data generation method
163 9.5.3 Incremental regularization pruning
167 9.6 Structured matrix
9.6.1 Data generation of structured matrix
168 9.6.2 Approach for generating data to be compressed in block circulant matrix with signed vectors
172 9.6.3 Approach for generating the weight in low-rank sparse decomposed structured matrix
9.6.3.1 Overview
173 9.6.3.2 Approaches for determining hyper-parameter R1, R2, groups and core_size
9.6.3.3 Process for the generation of weights of a low-rank sparse decomposed structured matrix
175 10. Compressed representation of neural network
10.1 Specification of syntax and semantics
180 10.2 Syntax
10.2.1 Neural network compression (NNC) bitstream syntax
181 10.2.2 NNC header syntax
182 10.2.3 NNC layer header syntax
183 10.2.4 NNC 1D array syntax
10.2.5 NNC CTU3D syntax
10.2.6 NNC CTU3D header syntax
184 10.2.7 NNC zdep_array syntax
185 10.2.8 NNC CU3D syntax
186 10.2.9 NNC predicted_codebook syntax
10.2.10 NNC sygnalled_codebook syntax
187 10.2.11 NNC unitree3d syntax
188 10.2.12 NNC octree3d syntax
190 10.2.13 NNC tagtree3d syntax
192 10.2.14 NNC uni_tagtree3d syntax
194 10.2.15 NNC escape syntax
10.3 Semantics
10.3.1 Initialization
10.3.2 NNC bitstream semantics
195 10.3.3 NNC header semantics
10.3.4 NNC layer header semantics
196 10.3.5 NNC 1D array semantics
10.3.6 NNC CTU3D semantics
10.3.7 NNC CU3D header semantics
10.3.8 NNC zdep_array semantics
10.3.9 NNC CU3D semantics
197 10.3.10 NNC predicted codebook semantics
10.3.11 NNC signaled codebook semantics
10.3.12 NNC unitree3d semantics
199 10.3.13 NNC octree3d semantics
200 10.3.14 NNC tagtree3d semantics
202 10.3.15 NNC uni_tagtree3d semantics
203 10.3.16 NNC escape semantics
204 10.4 Parsing process
10.4.1 Description
10.4.2 Initialization
10.4.2.1 Initialization of context model
10.4.2.2 Initialization of AEC decoder
10.4.3 Parsing binary string
10.4.3.1 Description
205 10.4.3.2 Determine ctxIdx
208 10.4.3.3 Parsing bins
10.4.3.3.1 Parsing process
10.4.3.3.2 decode_decision
209 10.4.3.3.3 decode_aec_stuffing_bit
10.4.3.3.4 decode_bypass
10.4.3.3.5 update_ctx
210 10.4.3.4 Binarization
10.4.3.4.1 Description
212 10.4.3.4.2 Binarization for fix length code (FL)
10.4.3.4.3 Binarization for unary code (U)
213 10.4.3.4.4 Binarization for truncated unary code (TU)
10.4.3.4.5 kth-order Exp-Golomb codes (EGk)
214 10.4.3.4.6 Joint truncated unary code and kth-order Exp-Golomb codes (UEGk)
215 10.5 Decoding process
10.5.1 General decoding process
10.5.2 Decoding NNC header
10.5.3 Decoding NNC layer header
10.5.4 Decoding NNC sublayer
217 10.5.5 Decoding 1D array
10.5.6 NNC CTU3D semantics
218 10.5.7 Decoding CTU3D
10.5.8 Decoding ZdepArray
10.5.9 Decoding CU3D
219 10.5.10 Decoding predicted codebook
10.5.11 Decoding signalled codebook
10.5.12 Decoding unitree3d
220 10.5.13 Decoding octree3d
10.5.14 Decoding tagtree3d
10.5.15 Decoding uni_tagtree3d
221 10.5.16 Decoding escape
222 11. Model protection
11.1 Model protection definition
223 11.2 Model encryption process
224 11.3 Model decryption process
225 11.4 Cipher model data structure definition
226 Back Cover