
Add support for Quark onnx quantization #2236

Merged
xiaoyu-work merged 13 commits into microsoft:main from gengxinwu:integrate-quark-onnx on Nov 3, 2025

Conversation

@gengxinwu (Contributor) commented Oct 31, 2025

Describe your changes

What it does:

  1. The QuarkQuantization pass can now also quantize ONNX-format models through the olive run interface
  2. Supports int8/uint8/int16/uint16/int32/uint32/bf16/bfp16 quantization data types
  3. Supports commonly used algorithms such as CLE, SmoothQuant, AdaRound, and AdaQuant

What is next:

  1. Add integration with the olive quantize interface
  2. Integrate newer versions of amd-quark
  3. Support more data formats, including MX formats
  4. Support more algorithms, like GPTQ, QuaRot ...
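To make the first point concrete, a workflow config invoking the new pass on an ONNX model might look roughly like the sketch below. The field names mirror common Olive conventions but are illustrative assumptions, not taken from this PR; consult the pass's PassConfigParam definitions for the real schema.

```python
# Hypothetical Olive workflow config for the QuarkQuantization pass on an
# ONNX model. All keys under "config" are assumed names for illustration.
workflow_config = {
    "input_model": {
        "type": "ONNXModel",      # the PR extends the pass beyond HF models
        "model_path": "model.onnx",
    },
    "passes": {
        "quark": {
            "type": "QuarkQuantization",
            "config": {
                "quant_format": "QDQ",       # assumed knob
                "activation_type": "uint8",  # assumed knob
                "weight_type": "int8",       # assumed knob
                "algorithm": "CLE",          # assumed knob
            },
        },
    },
}

# Such a config would normally be handed to `olive run`; here we only build it.
print(workflow_config["passes"]["quark"]["type"])
```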

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documentation if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

(Optional) Issue link

@xiaoyu-work requested a review from Copilot November 3, 2025 18:18

Copilot AI left a comment

Pull Request Overview

This PR adds ONNX model quantization support to the Quark quantizer pass. The existing QuarkQuantization pass only supported HuggingFace models; now it supports both ONNX and HuggingFace models.

Key changes:

  • Extended QuarkQuantization pass to handle ONNXModelHandler in addition to HfModelHandler
  • Added new ONNX-specific quantization logic and configuration preparation utilities
  • Included test coverage for the new ONNX quantization functionality

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Summary per file:

  • test/requirements-test.txt: Added amd-quark dependency, version 0.10
  • test/passes/quark_quantizer/__init__.py: Created package initialization file for quark quantizer tests
  • test/passes/quark_quantizer/test_quark_onnx_quantization.py: Added test case for static QDQ U8S8 quantization
  • olive/passes/quark_quantizer/quark_quantization.py: Extended pass to support ONNX models with new configuration parameters and a _run_quark_onnx method
  • olive/passes/quark_quantizer/onnx/__init__.py: Created package initialization file for ONNX quantizer
  • olive/passes/quark_quantizer/onnx/quantize_quark.py: Implemented ONNX model quantization using Quark's ModelQuantizer
  • olive/passes/quark_quantizer/onnx/configuration_preparation.py: Added configuration mapping utilities for converting dictionaries to Quark ONNX config objects
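The core of the quark_quantization.py change is dispatching on the input model handler type. A minimal sketch of that shape, with the handler classes stubbed out in place of Olive's real HfModelHandler and ONNXModelHandler (the HF method name below is an assumption; only _run_quark_onnx is named in this PR):

```python
# Stub handler classes standing in for Olive's real model handlers.
class HfModelHandler:
    pass

class ONNXModelHandler:
    pass

def run_quantization(model):
    """Route the model to the matching quantization path by handler type."""
    if isinstance(model, ONNXModelHandler):
        return "_run_quark_onnx"   # new ONNX path added by this PR
    if isinstance(model, HfModelHandler):
        return "_run_quark_hf"     # pre-existing HuggingFace path (name assumed)
    raise TypeError(f"Unsupported model handler: {type(model).__name__}")

print(run_quantization(ONNXModelHandler()))
```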

Comment on lines +197 to +201
else:
    # TODO(Gengxin): Configure the rest algorithms
    pass


Copilot AI Nov 3, 2025

This TODO comment indicates incomplete implementation. The update_algo_config function handles only AdaRoundConfig, AdaQuantConfig, CLEConfig, and SmoothQuantConfig, but the algorithm_mapping dictionary includes GPTQConfig, AutoMixprecisionConfig, and QuarotConfig which are not configured. Consider either implementing the missing algorithm configurations or documenting which algorithms are intentionally not yet supported.

Suggested change
else:
    # TODO(Gengxin): Configure the rest algorithms
    pass
elif isinstance(algo_config, GPTQConfig):
    # TODO: Implement configuration for GPTQConfig if/when fields are known
    logger.warning("GPTQConfig configuration is not yet supported. Using default values.")
elif isinstance(algo_config, AutoMixprecisionConfig):
    # TODO: Implement configuration for AutoMixprecisionConfig if/when fields are known
    logger.warning("AutoMixprecisionConfig configuration is not yet supported. Using default values.")
elif isinstance(algo_config, QuarotConfig):
    # TODO: Implement configuration for QuarotConfig if/when fields are known
    logger.warning("QuarotConfig configuration is not yet supported. Using default values.")
else:
    logger.warning("Unknown algorithm config type: %s. No configuration applied.", type(algo_config).__name__)
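An alternative to growing the isinstance chain is a dispatch table from config type to configurator, which makes "known but not yet supported" algorithms explicit in one place. A sketch with stand-in config classes (the real ones come from amd-quark's ONNX API; the configurator bodies are placeholders):

```python
import logging

logger = logging.getLogger(__name__)

# Stand-in config classes; the real ones are amd-quark's algorithm configs.
class CLEConfig: ...
class SmoothQuantConfig: ...
class GPTQConfig: ...

def _configure_cle(cfg):
    return "configured CLE"          # placeholder for real field updates

def _configure_smooth_quant(cfg):
    return "configured SmoothQuant"  # placeholder for real field updates

# Dispatch table: supported types map to a configurator, known-but-unsupported
# types map to None so they fall through to the warning path.
_CONFIGURATORS = {
    CLEConfig: _configure_cle,
    SmoothQuantConfig: _configure_smooth_quant,
    GPTQConfig: None,
}

def update_algo_config(algo_config):
    configurator = _CONFIGURATORS.get(type(algo_config))
    if configurator is None:
        logger.warning(
            "%s is not yet supported; using default values.",
            type(algo_config).__name__,
        )
        return None
    return configurator(algo_config)
```

Unknown types and not-yet-supported types both land on the warning branch, so adding a new algorithm is a one-line table entry rather than another elif.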

"exclude": PassConfigParam(
    type_=dict,
    default_value=None,
    description="List of nodes or subgraphs excluded from quantization. Default is None.",

Copilot AI Nov 3, 2025

The description states 'List of nodes or subgraphs' but the type is declared as dict. This is inconsistent. Either update the description to match the dict type (e.g., 'Dictionary defining nodes or subgraphs excluded from quantization') or change the type to list if it should actually be a list.

Suggested change
description="List of nodes or subgraphs excluded from quantization. Default is None.",
description="Dictionary defining nodes or subgraphs excluded from quantization. Default is None.",

@xiaoyu-work xiaoyu-work merged commit 2df25a7 into microsoft:main Nov 3, 2025
17 checks passed
skywall pushed a commit to NXP/eiq-olive that referenced this pull request Mar 10, 2026
Merge in AITEC/eiq-olive from feature/EITO-565-rebase-to-newest-version-of-olive-0.9.3 to main

* commit 'fd44fa6a51e382d59a88e4fceec49042b7e2caa5': (370 commits)
  ruff safe fixes
  update rebased on badge readme
  fix import
  things I've missed during rebasing
  ruff
  Revert "ruff stuff"
  ruff stuff
  Bump up version to 0.10.1
  Fix cache output model name bug (microsoft#2249)
  HfModelHandler: Check for tokenizer_config.json instead of try/else (microsoft#2247)
  Quantization: Keep embeddings tied in SelectiveMixedPrecision, Clean overrides (microsoft#2246)
  TieWordEmbeddings: return model when no tieing detected (microsoft#2242)
  Static Quantization: Always patch `MinMaxCalibrator` (microsoft#2241)
  Release branch 0.10.0
  Add custom onnx model name support for output dir (microsoft#2235)
  TieWordEmbeddings: unquantized and quantized support (microsoft#2240)
  Quantization: Embeddings quantization, new packing format, Rtn quantizer (microsoft#2238)
  Add support for Quark onnx quantization (microsoft#2236)
  Spelling fixes (microsoft#2234)
  LLMAugmentedDataLoader: No decode phase for non-GQA model (microsoft#2204)
  ...