{"id":400862,"date":"2024-10-20T04:53:00","date_gmt":"2024-10-20T04:53:00","guid":{"rendered":"https:\/\/pdfstandards.shop\/product\/uncategorized\/ieee-2941-2021\/"},"modified":"2024-10-26T08:40:26","modified_gmt":"2024-10-26T08:40:26","slug":"ieee-2941-2021","status":"publish","type":"product","link":"https:\/\/pdfstandards.shop\/product\/publishers\/ieee\/ieee-2941-2021\/","title":{"rendered":"IEEE 2941-2021"},"content":{"rendered":"
New IEEE Standard – Active. This standard specifies the AI development interface, the AI model interoperable representation, the coding format, and the model encapsulated format for efficient AI model inference, storage, distribution, and management.
| PDF Pages | PDF Title |
|---|---|
| 1 | IEEE Std 2941-2021 Front Cover |
| 2 | Title page |
| 4 | Important Notices and Disclaimers Concerning IEEE Standards Documents; Notice and Disclaimer of Liability Concerning the Use of IEEE Standards Documents |
| 5 | Translations; Official statements; Comments on standards; Laws and regulations; Data privacy; Copyrights |
| 6 | Photocopies; Updating of IEEE Standards documents; Errata; Patents; IMPORTANT NOTICE |
| 7 | Participants |
| 8 | Introduction |
| 9 | Contents |
| 11 | 1. Overview; 1.1 Scope; 1.2 Purpose; 1.3 Word usage |
| 12 | 2. Normative references; 3. Definitions, acronyms, and abbreviations; 3.1 Definitions |
| 14 | 3.2 Acronyms and abbreviations; 4. Symbols and operators; 4.1 Arithmetic operators |
| 15 | 4.2 Logical operator; 4.3 Relational operators |
| 16 | 4.4 Bitwise operators; 4.5 Assignment operators; 5. Framework of convolutional neural network representation and model compression |
| 17 | 6. Syntax and semantics of neural network models; 6.1 Data structure; 6.1.1 Data structure of neural network structure |
| 18 | 6.1.2 Data structure of neural network parameters |
| 19 | 6.2 Syntax description; 6.2.1 Overview |
| 20 | 6.2.2 Definition of model structure; 6.2.3 Definition of contributor list; 6.2.4 Definition of computational graph |
| 21 | 6.2.5 Definition of operator node; 6.2.6 Definition of variable node; 6.2.7 Definition of attribute |
| 22 | 6.2.8 Definition of other type; 6.2.9 Definition of tensor type |
| 23 | 6.2.10 Definition of tensor; 6.2.11 Definition of tensor size |
| 24 | 6.2.12 Definition of dimension; 6.3 Semantic description |
| 71 | 6.4 Definition of training operator; 6.4.1 Loss function |
| 73 | 6.4.2 Definition of inverse operator |
| 75 | 7. Compression process; 7.1 Multiple models; 7.1.1 Definition of multiple models technology |
| 76 | 7.1.2 Compression of multiple models |
| 77 | 7.1.3 Shared compression operator for weights of multiple model layers; 7.1.3.1 Definition; 7.1.3.2 Weight aggregation |
| 79 | 7.1.4 Residual quantization compression; 7.1.4.1 Definition of residual quantization for multiple models; 7.1.4.2 Weight sharing |
| 81 | 7.2 Quantization; 7.2.1 Definition |
| 82 | 7.2.2 Basic quantization operator |
| 83 | 7.2.2.1 Linear quantization |
| 84 | 7.2.2.2 Codebook quantization |
| 86 | 7.2.3 Parameter quantization operator; 7.2.3.1 Nonlinear function mapping |
| 88 | 7.2.3.2 INT4 parameter quantization |
| 90 | 7.2.3.3 Parameter quantization for bounded ReLU |
| 91 | 7.2.4 Activate quantization operator; 7.2.4.1 Trainable alpha quantization; 7.2.4.2 INT4 activation quantization |
| 93 | 7.2.4.3 Activation quantization for bounded ReLU |
| 95 | 7.2.4.4 Ratio synchronization quantization |
| 97 | 7.3 Pruning; 7.3.1 Overview |
| 98 | 7.3.2 Pruning operator |
| 100 | 7.4 Structured matrix; 7.4.1 Structured matrix compression |
| 101 | 7.4.2 Method for the compression of block circulant matrix with signed vectors; 7.4.2.1 Block circulant matrix compression operator |
| 102 | 7.4.2.2 Random vector dimension list and random vector generation operator |
| 104 | 7.4.3 Method for the low-rank sparse decomposed structured matrix; 7.4.3.1 Definition; 7.4.3.2 Decomposition compression operator for the convolutional layers in low-rank sparse decomposed structured matrix |
| 106 | 7.4.3.3 Compression operator of a fully connected or 1 × 1 convolutional layer in a low-rank sparse decomposed structured matrix |
| 107 | 8. Decompression process; 8.1 Multiple models; 8.1.1 Decompression for multiple models |
| 108 | 8.1.2 Decompression operator for weights of multiple model layers; 8.1.2.1 Decompression for weights of multiple model layers; 8.1.2.2 Decompression output multiple models |
| 110 | 8.1.2.3 Decompression output specific model |
| 111 | 8.1.2.4 Decompression output switched specific models |
| 112 | 8.1.3 Decompression of residual quantization for multiple models; 8.1.3.1 Definition of decompression; 8.1.3.2 Decompression of the output target model |
| 113 | 8.2 Dequantization; 8.2.1 Definition; 8.2.2 Basic dequantization operator |
| 115 | 8.2.2.1 Linear dequantization |
| 116 | 8.2.2.2 Codebook dequantization |
| 117 | 8.2.3 Parameter dequantization operator; 8.2.3.1 Nonlinear function mapping dequantization |
| 118 | 8.2.3.2 INT4 parameter dequantization |
| 119 | 8.2.4 Activate dequantization operator; 8.2.4.1 Trainable alpha value dequantization |
| 120 | 8.2.4.2 INT4 activation dequantization |
| 121 | 8.3 Inverse sparsity/inverse pruning operator; 8.3.1 Definition |
| 122 | 8.3.2 Inverse sparsity |
| 123 | 8.4 Structured matrix; 8.4.1 Decompression of structured matrix |
| 124 | 8.4.2 Method for the decompression of block circulant matrix with signed vectors; 8.4.2.1 Block circulant matrix decompression operator |
| 126 | 8.4.2.2 Disturbance vector generation operator |
| 128 | 8.4.2.3 Operator on the layers using signed vector and block circulant matrix techniques |
| 129 | 8.4.3 Methods for the decompression of low-rank sparse decomposed structured matrix; 8.4.3.1 Overview; 8.4.3.2 Decompression operator for low-rank sparse decomposed structured matrix |
| 130 | 8.4.3.3 Decompression operator for the fully connected and 1 × 1 layers in low-rank sparse decomposed structured matrix |
| 131 | 9. Data generation; 9.1 Definition; 9.2 Training data generation method; 9.2.1 Method of generating training data based on real data; 9.2.1.1 Overview; 9.2.1.2 Data augmentation method |
| 133 | 9.2.1.3 Generating data using the GAN |
| 135 | 9.2.2 Data-free training data generation method; 9.2.2.1 Overview; 9.2.2.2 Generating training data using the GAN |
| 138 | 9.3 Multiple models; 9.3.1 Method for weight generation in multiple models; 9.3.1.1 Multiple models weight update operator |
| 139 | 9.3.1.2 Multiple models weight shared data generation approach 1 |
| 141 | 9.3.1.3 Multiple models weight shared data generation approach 2 |
| 143 | 9.3.2 Residual quantization training method for multiple models |
| 144 | 9.4 Quantization; 9.4.1 Parameter quantization; 9.4.1.1 Data generation for INT4 parameter quantization |
| 150 | 9.4.1.2 Interval shrinkage quantization data generation |
| 154 | 9.4.2 Activate quantization; 9.4.2.1 Data generation for INT4 activate quantization |
| 158 | 9.4.2.2 Trainable alpha quantization training data generation |
| 159 | 9.5 Pruning; 9.5.1 Overview |
| 160 | 9.5.2 Sparse data generation method |
| 163 | 9.5.3 Incremental regularization pruning |
| 167 | 9.6 Structured matrix; 9.6.1 Data generation of structured matrix |
| 168 | 9.6.2 Approach for generating data to be compressed in block circulant matrix with signed vectors |
| 172 | 9.6.3 Approach for generating the weight in low-rank sparse decomposed structured matrix; 9.6.3.1 Overview |
| 173 | 9.6.3.2 Approaches for determining hyper-parameter R1, R2, groups and core_size; 9.6.3.3 Process for the generation of weights of a low-rank sparse decomposed structured matrix |
| 175 | 10. Compressed representation of neural network; 10.1 Specification of syntax and semantics |
| 180 | 10.2 Syntax; 10.2.1 Neural network compression (NNC) bitstream syntax |
| 181 | 10.2.2 NNC header syntax |
| 182 | 10.2.3 NNC layer header syntax |
| 183 | 10.2.4 NNC 1D array syntax; 10.2.5 NNC CTU3D syntax; 10.2.6 NNC CTU3D header syntax |
| 184 | 10.2.7 NNC zdep_array syntax |
| 185 | 10.2.8 NNC CU3D syntax |
| 186 | 10.2.9 NNC predicted_codebook syntax; 10.2.10 NNC sygnalled_codebook syntax |
| 187 | 10.2.11 NNC unitree3d syntax |
| 188 | 10.2.12 NNC octree3d syntax |
| 190 | 10.2.13 NNC tagtree3d syntax |
| 192 | 10.2.14 NNC uni_tagtree3d syntax |
| 194 | 10.2.15 NNC escape syntax; 10.3 Semantics; 10.3.1 Initialization; 10.3.2 NNC bitstream semantics |
| 195 | 10.3.3 NNC header semantics; 10.3.4 NNC layer header semantics |
| 196 | 10.3.5 NNC 1D array semantics; 10.3.6 NNC CTU3D semantics; 10.3.7 NNC CU3D header semantics; 10.3.8 NNC zdep_array semantics; 10.3.9 NNC CU3D semantics |
| 197 | 10.3.10 NNC predicted codebook semantics; 10.3.11 NNC signaled codebook semantics; 10.3.12 NNC unitree3d semantics |
| 199 | 10.3.13 NNC octree3d semantics |
| 200 | 10.3.14 NNC tagtree3d semantics |
| 202 | 10.3.15 NNC uni_tagtree3d semantics |
| 203 | 10.3.16 NNC escape semantics |
| 204 | 10.4 Parsing process; 10.4.1 Description; 10.4.2 Initialization; 10.4.2.1 Initialization of context model; 10.4.2.2 Initialization of AEC decoder; 10.4.3 Parsing binary string; 10.4.3.1 Description |
| 205 | 10.4.3.2 Determine ctxIdx |
| 208 | 10.4.3.3 Parsing bins; 10.4.3.3.1 Parsing process; 10.4.3.3.2 decode_decision |
| 209 | 10.4.3.3.3 decode_aec_stuffing_bit; 10.4.3.3.4 decode_bypass; 10.4.3.3.5 update_ctx |
| 210 | 10.4.3.4 Binarization; 10.4.3.4.1 Description |
| 212 | 10.4.3.4.2 Binarization for fixed-length code (FL); 10.4.3.4.3 Binarization for unary code (U) |
| 213 | 10.4.3.4.4 Binarization for truncated unary code (TU); 10.4.3.4.5 kth-order Exp-Golomb codes (EGk) |
| 214 | 10.4.3.4.6 Joint truncated unary code and kth-order Exp-Golomb codes (UEGk) |
| 215 | 10.5 Decoding process; 10.5.1 General decoding process; 10.5.2 Decoding NNC header; 10.5.3 Decoding NNC layer header; 10.5.4 Decoding NNC sublayer |
| 217 | 10.5.5 Decoding 1D array; 10.5.6 NNC CTU3D semantics |
| 218 | 10.5.7 Decoding CTU3D; 10.5.8 Decoding ZdepArray; 10.5.9 Decoding CU3D |
| 219 | 10.5.10 Decoding predicted codebook; 10.5.11 Decoding signalled codebook; 10.5.12 Decoding unitree3d |
| 220 | 10.5.13 Decoding octree3d; 10.5.14 Decoding tagtree3d; 10.5.15 Decoding uni_tagtree3d |
| 221 | 10.5.16 Decoding escape |
| 222 | 11. Model protection; 11.1 Model protection definition |
| 223 | 11.2 Model encryption process |
| 224 | 11.3 Model decryption process |
| 225 | 11.4 Cipher model data structure definition |
| 226 | Back Cover |
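
The standard's normative operator definitions (e.g., 7.2.2.1 Linear quantization and 8.2.2.1 Linear dequantization) are available only in the full PDF. As a minimal sketch of the general technique those clause titles refer to, the following shows affine (linear) quantization and its inverse; the function names and parameters (`num_bits`, `scale`, `zero_point`) are illustrative assumptions, not syntax elements defined by IEEE 2941.

```python
import numpy as np

def linear_quantize(x: np.ndarray, num_bits: int = 8):
    """Affine (linear) quantization of a float tensor to signed integers.

    Illustrative sketch only; the normative operator is defined in
    clause 7.2.2.1 of IEEE 2941. `num_bits`, `scale`, and `zero_point`
    are assumed names, not the standard's syntax elements.
    """
    qmin = -(2 ** (num_bits - 1))
    qmax = 2 ** (num_bits - 1) - 1
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / (qmax - qmin) or 1.0  # guard against scale == 0
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int32)
    return q, scale, zero_point

def linear_dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Inverse mapping: reconstruct an approximation of the original tensor."""
    return (q.astype(np.float32) - zero_point) * scale
```

For example, `q, s, zp = linear_quantize(w, num_bits=4)` followed by `linear_dequantize(q, s, zp)` round-trips a weight tensor with at most half a quantization step of error per element, which is what makes the compression lossy but bounded.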
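
Likewise, clauses 7.3.2 (Pruning operator) and 8.3.2 (Inverse sparsity) pair a pruning step with its inverse. The sketch below shows generic magnitude pruning with a binary mask and the corresponding dense re-expansion; it is an assumption-labeled illustration of the common technique, not the standard's normative operator.

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float = 0.5):
    """Generic magnitude pruning: zero the smallest-|w| entries.

    Illustrative sketch; IEEE 2941's pruning operator is defined
    normatively in clause 7.3.2 of the standard itself.
    """
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy(), np.ones_like(w, dtype=bool)
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    mask = np.abs(w) > threshold  # keep only weights above the cut
    return w * mask, mask

def inverse_sparsity(values: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Re-expand the kept nonzero values into a dense tensor via the mask."""
    dense = np.zeros(mask.shape, dtype=values.dtype)
    dense[mask] = values
    return dense
```

A typical round trip stores only the surviving values plus the mask: `pruned, mask = magnitude_prune(w, 0.7)`, then `inverse_sparsity(pruned[mask], mask)` recovers the dense pruned tensor.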