bnorm can fail with out_of_memory on cpu #4964

@Sqvid

Description
Summary

bnorm tests run via benchdnn do not gracefully skip cases where the required scratchpad memory is too large. This causes failures in the AArch64 Nightly pipeline in test_benchdnn_modeC_bnorm_regressions_cpu.

Version

8d4aa8b

Environment

Reproduced on x64 and AArch64.
hash: 8d4aa8b (this commit introduced the failing case, but the underlying issue already existed).

Steps to reproduce

```
$ ./build/tests/benchdnn/benchdnn --bnorm --dt=bf16 --inplace=true mb1ic512ih65536
Error: Function 'create_primitive' at (oneDNN/tests/benchdnn/dnnl_common.hpp:469) returned 'out_of_memory'
Error: Function 'init_prim' at (oneDNN/tests/benchdnn/dnnl_common.hpp:523) returned '1'
Error: Function 'createit' at (oneDNN/tests/benchdnn/bnorm/bnorm.cpp:710) returned '1'
Error: Function 'create' at (oneDNN/tests/benchdnn/utils/task.hpp:57) returned '1'
0:UNTESTED_FAILED (0 ms) __REPRO: --bnorm --dt=bf16 --inplace=true mb1ic512ih65536
===========================================================
= Failed cases summary (--summary=no-failures to disable) =
===========================================================
0:UNTESTED_FAILED (0 ms) __REPRO: --bnorm --dt=bf16 --inplace=true mb1ic512ih65536
============================
tests:1 passed:0 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:1 listed:0
total: 0.00s; create_pd: 0.00s (3%); create_prim: 0.00s (0%); fill: 0.00s (0%); execute: 0.00s (0%); compute_ref: 0.00s (0%); compare: 0.00s (0%);
```

Observed behavior

The case fails with status UNTESTED_FAILED.

Expected behavior

Expected status: SKIPPED. The GPU backend appears to have handling for skipping cases whose scratchpad is too large, though I have not tested it:

```cpp
// The library scratchpad is allocated at create_primitive stage. The memory
// check is moved after the creation stage. It's necessary to check the
// library scratchpad size against gpu_max_alloc, otherwise, out_of_memory
// would be issued by the library.
if (res->mem_size_args.scratchpad_size > 0 && is_gpu()
        && query_scratchpad_mode(query_attr(pdw))
                == dnnl_scratchpad_mode_library) {
    static size_t gpu_device_capacity = 0;
    static size_t gpu_max_alloc_capacity = 0;
    SAFE(get_gpu_ram_sizes(gpu_device_capacity, gpu_max_alloc_capacity),
            WARN);
    const bool fit
            = res->mem_size_args.scratchpad_size < gpu_max_alloc_capacity;
    if (!fit) {
        BENCHDNN_PRINT(1,
                "[CHECK_MEM]: Size of the scratchpad %s "
                "doesn't fit the allocation limit of %s.\n",
                smart_bytes(res->mem_size_args.scratchpad_size).c_str(),
                smart_bytes(gpu_max_alloc_capacity).c_str());
        res->state = SKIPPED;
        res->reason = skip_reason::not_enough_ram;
        return OK;
    }
}
```

Metadata

Assignees

No one assigned

    Labels

    bug — A confirmed library bug
    component:tests — Codeowner: @oneapi-src/onednn-arch
    platform:cpu-aarch64 — Codeowner: @oneapi-src/onednn-cpu-aarch64
    platform:cpu-x64 — Intel64/AMD64 processors. Codeowner: @oneapi-src/onednn-cpu-x64
