Commit 5f78ec3

Do not convert weight scale to e4m3fnuz on CUDA (#2917)
1 parent 922cc38 commit 5f78ec3

1 file changed: +1 -1 lines changed
server/text_generation_server/layers/compressed_tensors/w8an_fp.py

Lines changed: 1 addition & 1 deletion
@@ -147,7 +147,7 @@ def get_multi_weights_col(self, weights: "Weights", prefixes: List[str], dim: in
             else None
         )
 
-        if self.load_weight_scale or SYSTEM == "rocm":
+        if self.load_weight_scale and SYSTEM == "rocm":
             w, weight_scale, input_scale = normalize_e4m3fn_to_e4m3fnuz(
                 w, weight_scale, input_scale
             )
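
The effect of swapping `or` for `and` can be checked with a small truth table. The sketch below is illustrative only: the helper names `should_normalize_old`, `should_normalize_new`, and `changed_cases` are hypothetical and not part of the repository, and the two flags stand in for `self.load_weight_scale` and `SYSTEM` from the diff.

```python
from itertools import product


def should_normalize_old(load_weight_scale: bool, system: str) -> bool:
    # Condition before this commit (hypothetical helper mirroring the removed line).
    return load_weight_scale or system == "rocm"


def should_normalize_new(load_weight_scale: bool, system: str) -> bool:
    # Condition after this commit (hypothetical helper mirroring the added line).
    return load_weight_scale and system == "rocm"


def changed_cases():
    # Enumerate every (load_weight_scale, system) pair whose outcome differs
    # between the old and the new condition.
    return [
        (load, system)
        for load, system in product((False, True), ("cuda", "rocm"))
        if should_normalize_old(load, system) != should_normalize_new(load, system)
    ]


if __name__ == "__main__":
    for load, system in changed_cases():
        print(
            f"load_weight_scale={load}, SYSTEM={system!r}: "
            f"old={should_normalize_old(load, system)}, "
            f"new={should_normalize_new(load, system)}"
        )
```

Under these assumptions, two input combinations change: with a weight scale loaded on CUDA, the old condition called `normalize_e4m3fn_to_e4m3fnuz` and the new one does not, which matches the commit title. The conversion is also skipped on ROCm when `load_weight_scale` is false; the commit does not say whether that case occurs in practice.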
