Discussion:
[FFmpeg-devel] [PATCH] nvenc: Compensate for hardware trying to mess with aspect ratio of DVD content.
Philip Langdale
2015-01-17 19:35:41 UTC
Permalink
There is a long sad story behind all this, but it's somewhat ambiguous as to
whether DVD content should be treated as 720 pixels wide or 704 pixels, with
16 pixels cut off. If you decide is should be 704 pixels wide, you need to
adjust the sample aspect ratio to keep the final display aspect ratio correct.

For reasons we are not privy too, nvidia decided that the nvenc encoder should
apply this aspect correction, whether you want it to or not. (I guess there
might be a flag for it, but if there is it's not documented). So, if you want
to transcode DVD content at the original size, you need to adjust the aspect
ratio information you pass to the encoder to compensate for their 'correction'.

This 'correction' is only applied to 720x480 and 720x576 content - and it does
so regardless of the input aspect ratio.
---
libavcodec/nvenc.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/libavcodec/nvenc.c b/libavcodec/nvenc.c
index efa3f04..a51ada2 100644
--- a/libavcodec/nvenc.c
+++ b/libavcodec/nvenc.c
@@ -587,6 +587,13 @@ static av_cold int nvenc_encode_init(AVCodecContext *avctx)
ctx->init_encode_params.darWidth = avctx->width;
}

+ // Compensate for hardware assuming playback will be at 704 pixel width.
+ if (avctx->width == 720 &&
+ (avctx->height == 480 || avctx->height == 576)) {
+ ctx->init_encode_params.darWidth *= 44;
+ ctx->init_encode_params.darHeight *= 45;
+ }
+
ctx->init_encode_params.frameRateNum = avctx->time_base.den;
ctx->init_encode_params.frameRateDen = avctx->time_base.num * avctx->ticks_per_frame;
--
2.1.0
Nicolas George
2015-01-17 19:47:51 UTC
Permalink
Post by Philip Langdale
There is a long sad story behind all this, but it's somewhat ambiguous as to
whether DVD content should be treated as 720 pixels wide or 704 pixels, with
16 pixels cut off. If you decide is should be 704 pixels wide, you need to
adjust the sample aspect ratio to keep the final display aspect ratio correct.
For reasons we are not privy too, nvidia decided that the nvenc encoder should
apply this aspect correction, whether you want it to or not. (I guess there
might be a flag for it, but if there is it's not documented). So, if you want
to transcode DVD content at the original size, you need to adjust the aspect
ratio information you pass to the encoder to compensate for their 'correction'.
This 'correction' is only applied to 720x480 and 720x576 content - and it does
so regardless of the input aspect ratio.
Thanks for testing.
Post by Philip Langdale
---
libavcodec/nvenc.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/libavcodec/nvenc.c b/libavcodec/nvenc.c
index efa3f04..a51ada2 100644
--- a/libavcodec/nvenc.c
+++ b/libavcodec/nvenc.c
@@ -587,6 +587,13 @@ static av_cold int nvenc_encode_init(AVCodecContext *avctx)
ctx->init_encode_params.darWidth = avctx->width;
}
+ // Compensate for hardware assuming playback will be at 704 pixel width.
Nit: de-compensate for buggy encoders that compensate for...
Post by Philip Langdale
+ if (avctx->width == 720 &&
+ (avctx->height == 480 || avctx->height == 576)) {
+ ctx->init_encode_params.darWidth *= 44;
+ ctx->init_encode_params.darHeight *= 45;
+ }
Here, and also in Timo's patch, I would advice to use lavu's rational
operators: they will handle overflow better and reduce the result.
Post by Philip Langdale
+
ctx->init_encode_params.frameRateNum = avctx->time_base.den;
ctx->init_encode_params.frameRateDen = avctx->time_base.num * avctx->ticks_per_frame;
Regards,
--
Nicolas George
Philip Langdale
2015-01-17 20:01:49 UTC
Permalink
There is a long sad story behind all this, but it's somewhat ambiguous as to
whether DVD content should be treated as 720 pixels wide or 704 pixels, with
16 pixels cut off. If you decide is should be 704 pixels wide, you need to
adjust the sample aspect ratio to keep the final display aspect ratio correct.

For reasons we are not privy too, nvidia decided that the nvenc encoder should
apply this aspect correction, whether you want it to or not. (I guess there
might be a flag for it, but if there is it's not documented). So, if you want
to transcode DVD content at the original size, you need to adjust the aspect
ratio information you pass to the encoder to compensate for their 'correction'.

This 'correction' is only applied to 720x480 and 720x576 content - and it does
so regardless of the input aspect ratio.
---
libavcodec/nvenc.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/libavcodec/nvenc.c b/libavcodec/nvenc.c
index efa3f04..8a0d584 100644
--- a/libavcodec/nvenc.c
+++ b/libavcodec/nvenc.c
@@ -587,6 +587,18 @@ static av_cold int nvenc_encode_init(AVCodecContext *avctx)
ctx->init_encode_params.darWidth = avctx->width;
}

+ // De-compensate for hardware, dubiously, trying to compensate for
+ // playback at 704 pixel width.
+ if (avctx->width == 720 &&
+ (avctx->height == 480 || avctx->height == 576)) {
+ av_reduce(&dw, &dh,
+ ctx->init_encode_params.darWidth * 44,
+ ctx->init_encode_params.darHeight * 45,
+ 1024 * 1204);
+ ctx->init_encode_params.darHeight = dh;
+ ctx->init_encode_params.darWidth = dw;
+ }
+
ctx->init_encode_params.frameRateNum = avctx->time_base.den;
ctx->init_encode_params.frameRateDen = avctx->time_base.num * avctx->ticks_per_frame;
--
2.1.0
Kieran Kunhya
2015-01-17 20:11:33 UTC
Permalink
Post by Philip Langdale
There is a long sad story behind all this, but it's somewhat ambiguous as to
whether DVD content should be treated as 720 pixels wide or 704 pixels, with
16 pixels cut off. If you decide is should be 704 pixels wide, you need to
adjust the sample aspect ratio to keep the final display aspect ratio correct.
BT601 makes this very clear. The active picture is 702 pixels.

Kieran
Nicolas George
2015-01-17 20:42:43 UTC
Permalink
Post by Kieran Kunhya
BT601 makes this very clear. The active picture is 702 pixels.
There are two fundamental flaws with your reasoning:

First, BT601 only applies to a some kind of videos. Wikipedia tells me it
applies to "encoding interlaced analog video signals in digital video form";
do you think it is wrong? For all other kind of videos, all this nonsense is
just irrelevant.

Second, and that is more important, the SAR→DAR conversion is not a matter
of conforming to a certain video standard, it is a matter of basic
arithmetic. Arithmetic has been around a lot longer than BT601, or any
industrial standard for video. If you have a "BT601 720ish×576 16/9 video",
then SAR must be (16/9) / (702/576) = 512/351, nothing else; reversely, if
the SAR is 64/45, that means the video is really 720×576 for 16/9 without
any nonsense.

Regards,
--
Nicolas George
Kieran Kunhya
2015-01-17 22:49:26 UTC
Permalink
Post by Nicolas George
Post by Kieran Kunhya
BT601 makes this very clear. The active picture is 702 pixels.
First, BT601 only applies to a some kind of videos. Wikipedia tells me it
applies to "encoding interlaced analog video signals in digital video form";
do you think it is wrong? For all other kind of videos, all this nonsense is
just irrelevant.
And the digital signals that are recorded by my camera need to be
backwards compatible with the hundreds of millions of analogue TVs.
This forms almost the entirety of the set of 720x480 or 720x576 videos.

I don't make the standards and frankly whether you dislike them is
your problem but they exist and need to work correctly.
Instead you wish to break things based off an artificial test pattern
and your own beliefs.

Wikipedia seems to be wrong in a number of areas.

Kieran
Nicolas George
2015-01-17 23:00:26 UTC
Permalink
Post by Kieran Kunhya
I don't make the standards and frankly whether you dislike them is
your problem but they exist and need to work correctly.
Instead you wish to break things based off an artificial test pattern
and your own beliefs.
I do not wish to break things, I wish to fix them correctly.

The so-called fix in nvenc is completely wrong, because it happens at the
wrong place: it introduces exceptional cases, it violates the rules of
arithmetic, it is not documented, it affects cases that should be left alone
and misses cases it should handle, it behaved differently than other parts
of the code. Is it enough, or do you need more reasons?

If the aspect ratio needs to be fixed, it must happen as soon as the video
enters FFmpeg, i.e. at the demuxer or decoder level. Any other place is
wrong.

As is, Philip's patch fixes nvenc's misbehaviour, making it consistent with
arithmetic and other encoders: the latest version should be applied soon.
You, or anyone else who knows what they are doing, can file another patch to
properly fix the issue, if it happens there is anything to fix.

Regards,
--
Nicolas George
Kieran Kunhya
2015-01-17 23:30:38 UTC
Permalink
Post by Nicolas George
Post by Kieran Kunhya
I don't make the standards and frankly whether you dislike them is
your problem but they exist and need to work correctly.
Instead you wish to break things based off an artificial test pattern
and your own beliefs.
I do not wish to break things, I wish to fix them correctly.
The so-called fix in nvenc is completely wrong, because it happens at the
wrong place: it introduces exceptional cases, it violates the rules of
arithmetic, it is not documented, it affects cases that should be left alone
and misses cases it should handle, it behaved differently than other parts
of the code. Is it enough, or do you need more reasons?
The behaviour of the Nvidia code I believe is correct.
As far as I understand it corrects SAR for 720-width content to comply
with BT601.

Kieran
Nicolas George
2015-01-17 23:38:38 UTC
Permalink
Post by Kieran Kunhya
The behaviour of the Nvidia code I believe is correct.
As far as I understand it corrects SAR for 720-width content to comply
with BT601.
That is just not true.

Basic principle: THE COMPETENT USER IS RIGHT.

If someone knows BT601, and knowing it decide to produce a video that does
not conform to it, for example writing "testsrc=s=720x576,setsar=64/45",
then the software must comply.

There is a very simple way of flagging content that is supposed to comply
with BT601: the SAR is 512/351. If SAR is 64/45, that means someone before
nvenc decided that the video is not expected to conform with BT601, and
nvenc has no right to decide otherwise.

Regards,
--
Nicolas George
Kieran Kunhya
2015-01-17 23:46:21 UTC
Permalink
Post by Nicolas George
Post by Kieran Kunhya
The behaviour of the Nvidia code I believe is correct.
As far as I understand it corrects SAR for 720-width content to comply
with BT601.
That is just not true.
Basic principle: THE COMPETENT USER IS RIGHT.
If someone knows BT601, and knowing it decide to produce a video that does
not conform to it, for example writing "testsrc=s=720x576,setsar=64/45",
then the software must comply.
No, the user who doesn't want BT601 should override it.

You don't realise this but there are many orders of magnitude more
BT601 content than any other content with a 720x480 or 720x576
resolution. This is what the vast majority of users are processing
with FFmpeg.
Post by Nicolas George
There is a very simple way of flagging content that is supposed to comply
with BT601: the SAR is 512/351. If SAR is 64/45, that means someone before
nvenc decided that the video is not expected to conform with BT601, and
nvenc has no right to decide otherwise.
You fail to realise this but SARs are not necessarily arithmetic for
historical reasons.

Kieran
Kieran Kunhya
2015-01-17 23:48:31 UTC
Permalink
Post by Nicolas George
Post by Kieran Kunhya
The behaviour of the Nvidia code I believe is correct.
As far as I understand it corrects SAR for 720-width content to comply
with BT601.
That is just not true.
Basic principle: THE COMPETENT USER IS RIGHT.
If someone knows BT601, and knowing it decide to produce a video that does
not conform to it, for example writing "testsrc=s=720x576,setsar=64/45",
then the software must comply.
Oops I misunderstood, you mean the software must comply with the users wishes.
Anyway my argument is that BT601 should be the default for the these
resolutions.

Kieran
Nicolas George
2015-01-18 00:02:10 UTC
Permalink
Post by Kieran Kunhya
Oops I misunderstood, you mean the software must comply with the users wishes.
Yes.
Post by Kieran Kunhya
Anyway my argument is that BT601 should be the default for the these
resolutions.
If you want, but that must happen immediately when the contents enters into
FFmpeg's data structure, not at a random point in the processing chain.

In other words, it should happen immediately when FFmpeg reads contents that
is "probably" BT601:

Input stream #0:0: Video: mpeg2video (Main), yuv420p(tv),
720x576 [SAR 64:45 DAR 16:9]

should be fixed into:

Input stream #0:0: Video: mpeg2video (Main), yuv420p(tv),
720x576 [SAR 512:351 DAR 640:351]

That way, the laws of geometry and arithmetic are respected, the size of the
pixels is correct, and it will work will all encoders, not just nvenc.

Regards,
--
Nicolas George
Hendrik Leppkes
2015-01-18 01:52:59 UTC
Permalink
Post by Kieran Kunhya
Post by Kieran Kunhya
Oops I misunderstood, you mean the software must comply with the users
wishes.
Yes.
Post by Kieran Kunhya
Anyway my argument is that BT601 should be the default for the these
resolutions.
If you want, but that must happen immediately when the contents enters into
FFmpeg's data structure, not at a random point in the processing chain.
Without arguing for or against BT.601 behaviour...

nvenc should behave the same as libx264, or any other video encoder, if
this patch makes it do that, then it should be applied.

If bt601 needs special handling not yet present in avcodec, it should be
implemented in such a way that all encoders can benefit from it.

- Hendrik
Nicolas George
2015-01-18 09:40:24 UTC
Permalink
Post by Hendrik Leppkes
nvenc should behave the same as libx264, or any other video encoder, if
this patch makes it do that, then it should be applied.
If bt601 needs special handling not yet present in avcodec, it should be
implemented in such a way that all encoders can benefit from it.
That was one of the points I was trying to make, thanks for explaining it
more clearly.

Two additional points:

According to Philip's tests, nvenc adds a 45/44 = 720/704 factor, not
720/702 = 40/39, so it does not conform to BT.601 anyway.

I have downloaded BT.601 from <URL: http://www.itu.int/rec/R-REC-BT.601/ >,
both the most recent and the oldest versions, and I did not find any
explicit reference of this 702 business, nor could I decipher indirect
references, so I still wonder where it comes from exactly.

(I had previously assumed it would be insanely expensive like the MPEG
standards, but it is free and short. On the other hand, I do not expect
people who do not bother to properly configure their character encodings to
produce good standards.)

Regards,
--
Nicolas George
Kieran Kunhya
2015-01-18 11:16:04 UTC
Permalink
Post by Nicolas George
Post by Hendrik Leppkes
nvenc should behave the same as libx264, or any other video encoder, if
this patch makes it do that, then it should be applied.
If bt601 needs special handling not yet present in avcodec, it should be
implemented in such a way that all encoders can benefit from it.
That was one of the points I was trying to make, thanks for explaining it
more clearly.
According to Philip's tests, nvenc adds a 45/44 = 720/704 factor, not
720/702 = 40/39, so it does not conform to BT.601 anyway.
Why is it so hard for you to realise that the SARs are not
arithmetically derived?
Post by Nicolas George
I have downloaded BT.601 from <URL: http://www.itu.int/rec/R-REC-BT.601/ >,
both the most recent and the oldest versions, and I did not find any
explicit reference of this 702 business, nor could I decipher indirect
references, so I still wonder where it comes from exactly.
You can make the appropriate calculations *and simplifications* for
BT601. There are plenty of books with the appropriate derivations.

Kieran
Michael Niedermayer
2015-01-18 12:31:48 UTC
Permalink
Post by Kieran Kunhya
Post by Nicolas George
Post by Hendrik Leppkes
nvenc should behave the same as libx264, or any other video encoder, if
this patch makes it do that, then it should be applied.
If bt601 needs special handling not yet present in avcodec, it should be
implemented in such a way that all encoders can benefit from it.
That was one of the points I was trying to make, thanks for explaining it
more clearly.
According to Philip's tests, nvenc adds a 45/44 = 720/704 factor, not
720/702 = 40/39, so it does not conform to BT.601 anyway.
Why is it so hard for you to realise that the SARs are not
arithmetically derived?
Rec. ITU-T H.264 (04/2013)

3.133 sample aspect ratio: Specifies, for assisting the display process, which is not specified in this
Recommendation | International Standard, the ratio between the intended horizontal distance between the
columns and the intended vertical distance between the rows of the luma sample array in a frame. Sample
aspect ratio is expressed as h:v, where h is horizontal width and v is vertical height (in arbitrary units of spatial
distance).

3.62 informative: A term used to refer to content provided in this Recommendation | International Standard that is
not an integral part of this Recommendation | International Standard. Informative content does not establish
any mandatory requirements for conformance to this Recommendation | International Standard.

aspect_ratio_idc specifies the value of the sample aspect ratio of the luma samples. Table E-1 shows the meaning of the
code. When aspect_ratio_idc indicates Extended_SAR, the sample aspect ratio is represented by sar_width : sar_height.
When the aspect_ratio_idc syntax element is not present, aspect_ratio_idc value shall be inferred to be equal to 0.
Table E-1 “ Meaning of sample aspect ratio indicator
aspect_ratio_idc Sample aspect (informative)
ratio Examples of use
0 Unspecified
1 1:1 7680x4320 16:9 frame without horizontal overscan
("square") 3840x2160 16:9 frame without horizontal overscan
1280x720 16:9 frame without horizontal overscan
1920x1080 16:9 frame without horizontal overscan (cropped from 1920x1088)
640x480 4:3 frame without horizontal overscan
2 12:11 720x576 4:3 frame with horizontal overscan
352x288 4:3 frame without horizontal overscan
3 10:11 720x480 4:3 frame with horizontal overscan
352x240 4:3 frame without horizontal overscan
4 16:11 720x576 16:9 frame with horizontal overscan
528x576 4:3 frame without horizontal overscan
5 40:33 720x480 16:9 frame with horizontal overscan
528x480 4:3 frame without horizontal overscan
6 24:11 352x576 4:3 frame without horizontal overscan
480x576 16:9 frame with horizontal overscan
7 20:11 352x480 4:3 frame without horizontal overscan
480x480 16:9 frame with horizontal overscan
8 32:11 352x576 16:9 frame without horizontal overscan
9 80:33 352x480 16:9 frame without horizontal overscan
10 18:11 480x576 4:3 frame with horizontal overscan
11 15:11 480x480 4:3 frame with horizontal overscan
12 64:33 528x576 16:9 frame without horizontal overscan
13 160:99 528x480 16:9 frame without horizontal overscan
14 4:3 1440x1080 16:9 frame without horizontal overscan
15 3:2 1280x1080 16:9 frame without horizontal overscan
16 2:1 960x1080 16:9 frame without horizontal overscan
17..254 Reserved
255 Extended_SAR
NOTE 1 “ For the examples in Table E-1, the term "without horizontal overscan" refers to display processes in which the display
area matches the area of the cropped decoded pictures and the term "with horizontal overscan" refers to display processes in which
some parts near the left and/or right border of the cropped decoded pictures are not visible in the display area. As an example, the
entry "720x576 4:3 frame with horizontal overscan" for aspect_ratio_idc equal to 2 refers to having an area of 704x576 luma
samples (which has an aspect ratio of 4:3) of the cropped decoded frame (720x576 luma samples) that is visible in the display
area.
sar_width indicates the horizontal size of the sample aspect ratio (in arbitrary units).
sar_height indicates the vertical size of the sample aspect ratio (in the same arbitrary units as sar_width).
sar_width and sar_height shall be relatively prime or equal to 0. When aspect_ratio_idc is equal to 0 or sar_width is
equal to 0 or sar_height is equal to 0, the sample aspect ratio shall be considered unspecified by this Recommendation |
International Standard.

overscan_info_present_flag equal to 1 specifies that the overscan_appropriate_flag is present. When
overscan_info_present_flag is equal to 0 or is not present, the preferred display method for the video signal is
unspecified.
overscan_appropriate_flag equal to 1 indicates that the cropped decoded pictures output are suitable for display using
overscan. overscan_appropriate_flag equal to 0 indicates that the cropped decoded pictures output contain visually
important information in the entire region out to the edges of the cropping rectangle of the picture, such that the cropped
decoded pictures output should not be displayed using overscan. Instead, they should be displayed using either an exact
match between the display area and the cropping rectangle, or using underscan. As used in this paragraph, the term
"overscan" refers to display processes in which some parts near the borders of the cropped decoded pictures are not
visible in the display area. The term "underscan" describes display processes in which the entire cropped decoded
pictures are visible in the display area, but they do not cover the entire display area. For display processes that neither use
overscan nor underscan, the display area exactly matches the area of the cropped decoded pictures.
NOTE 2 “ For example, overscan_appropriate_flag equal to 1 might be used for entertainment television programming, or for a
live view of people in a videoconference, and overscan_appropriate_flag equal to 0 might be used for computer screen capture or
security camera content.


[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

There will always be a question for which you do not know the correct answer.
compn
2015-01-18 13:50:53 UTC
Permalink
On Sun, 18 Jan 2015 10:40:24 +0100
http://www.itu.int/rec/R-REC-BT.601/ >, both the most recent and the
oldest versions, and I did not find any explicit reference of this
702 business, nor could I decipher indirect references, so I still
wonder where it comes from exactly.
i've seen the "SD content is bt.601 and HD content is bt.709" edict
before, but i do not know where it comes from either.

-compn
Michael Niedermayer
2015-01-18 12:22:43 UTC
Permalink
Post by Hendrik Leppkes
Post by Kieran Kunhya
Post by Kieran Kunhya
Oops I misunderstood, you mean the software must comply with the users
wishes.
Yes.
Post by Kieran Kunhya
Anyway my argument is that BT601 should be the default for the these
resolutions.
If you want, but that must happen immediately when the contents enters into
FFmpeg's data structure, not at a random point in the processing chain.
Without arguing for or against BT.601 behaviour...
nvenc should behave the same as libx264, or any other video encoder, if
this patch makes it do that, then it should be applied.
Yes, and AFAICS an earlier mail from philip indicates this is the case
"
Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 720x576
[SAR 64:45 DAR 16:9], 25 tbr, 25 tbn, 25 tbc

With libx264, I get the same SAR and DAR out.

With nvenc, I get:

sample_aspect_ratio=16:11
display_aspect_ratio=20:11
"

In above example the user asks for a NON BT.601 format 720x576 at
SAR 64:45 and a DAR related to 720x576 NOT 702 or 704x576 of 16:9
yet nvenc gives her something else


[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The bravest are surely those who have the clearest vision
of what is before them, glory and danger alike, and yet
notwithstanding go out to meet it. -- Thucydides
Kieran Kunhya
2015-01-18 13:28:36 UTC
Permalink
Post by Michael Niedermayer
Post by Hendrik Leppkes
Post by Kieran Kunhya
Post by Kieran Kunhya
Oops I misunderstood, you mean the software must comply with the users
wishes.
Yes.
Post by Kieran Kunhya
Anyway my argument is that BT601 should be the default for the these
resolutions.
If you want, but that must happen immediately when the contents enters into
FFmpeg's data structure, not at a random point in the processing chain.
Without arguing for or against BT.601 behaviour...
nvenc should behave the same as libx264, or any other video encoder, if
this patch makes it do that, then it should be applied.
Yes, and AFAICS an earlier mail from philip indicates this is the case
"
Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 720x576
[SAR 64:45 DAR 16:9], 25 tbr, 25 tbn, 25 tbc
With libx264, I get the same SAR and DAR out.
sample_aspect_ratio=16:11
display_aspect_ratio=20:11
"
In above example the user asks for a NON BT.601 format 720x576 at
SAR 64:45 and a DAR related to 720x576 NOT 702 or 704x576 of 16:9
yet nvenc gives her something else
Here's the point where I give up and start my own fork where things
are done properly.
Michael Niedermayer
2015-01-18 13:55:01 UTC
Permalink
Post by Kieran Kunhya
Post by Michael Niedermayer
Post by Hendrik Leppkes
Post by Kieran Kunhya
Post by Kieran Kunhya
Oops I misunderstood, you mean the software must comply with the users
wishes.
Yes.
Post by Kieran Kunhya
Anyway my argument is that BT601 should be the default for the these
resolutions.
If you want, but that must happen immediately when the contents enters into
FFmpeg's data structure, not at a random point in the processing chain.
Without arguing for or against BT.601 behaviour...
nvenc should behave the same as libx264, or any other video encoder, if
this patch makes it do that, then it should be applied.
Yes, and AFAICS an earlier mail from philip indicates this is the case
"
Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 720x576
[SAR 64:45 DAR 16:9], 25 tbr, 25 tbn, 25 tbc
With libx264, I get the same SAR and DAR out.
sample_aspect_ratio=16:11
display_aspect_ratio=20:11
"
In above example the user asks for a NON BT.601 format 720x576 at
SAR 64:45 and a DAR related to 720x576 NOT 702 or 704x576 of 16:9
yet nvenc gives her something else
Here's the point where I give up and start my own fork where things
are done properly.
please elaborate about where you disagree

I think theres a misunderstanding between people on what some of the
aspect ratios mean and what they refer to

[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Let us carefully observe those good qualities wherein our enemies excel us
and endeavor to excel them, by avoiding what is faulty, and imitating what
is excellent in them. -- Plutarch
Kieran Kunhya
2015-01-18 14:13:00 UTC
Permalink
Post by Michael Niedermayer
Post by Kieran Kunhya
Post by Michael Niedermayer
Post by Hendrik Leppkes
Post by Kieran Kunhya
Post by Kieran Kunhya
Oops I misunderstood, you mean the software must comply with the users
wishes.
Yes.
Post by Kieran Kunhya
Anyway my argument is that BT601 should be the default for the these
resolutions.
If you want, but that must happen immediately when the contents enters into
FFmpeg's data structure, not at a random point in the processing chain.
Without arguing for or against BT.601 behaviour...
nvenc should behave the same as libx264, or any other video encoder, if
this patch makes it do that, then it should be applied.
Yes, and AFAICS an earlier mail from philip indicates this is the case
"
Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 720x576
[SAR 64:45 DAR 16:9], 25 tbr, 25 tbn, 25 tbc
With libx264, I get the same SAR and DAR out.
sample_aspect_ratio=16:11
display_aspect_ratio=20:11
"
In above example the user asks for a NON BT.601 format 720x576 at
SAR 64:45 and a DAR related to 720x576 NOT 702 or 704x576 of 16:9
yet nvenc gives her something else
Here's the point where I give up and start my own fork where things
are done properly.
please elaborate about where you disagree
I think theres a misunderstanding between people on what some of the
aspect ratios mean and what they refer to
Read
https://web.archive.org/web/20130328121216/http://lipas.uwasa.fi/~f76998/video/conversion/

and ETSI TS 101 154 (specifically the section on h264 aspect_ratio_idc):
www.etsi.org/deliver/etsi_ts/101100_101199/101154/01.11.01_60/ts_101154v011101p.pdf

Kieran
Nicolas George
2015-01-17 23:52:45 UTC
Permalink
Post by Kieran Kunhya
Post by Nicolas George
There is a very simple way of flagging content that is supposed to comply
with BT601: the SAR is 512/351. If SAR is 64/45, that means someone before
nvenc decided that the video is not expected to conform with BT601, and
nvenc has no right to decide otherwise.
You fail to realise this but SARs are not necessarily arithmetic for
historical reasons.
Care to explain how historical reasons can change the laws of geometry?

If a visible surface with a physical aspect ratio of 16/9 is cut into
702×576 identical rectangular pixels, then the aspect ratio of each pixel is
(16/9) / (702/576) = 512/351. Can you give one good reason for FFmpeg to
use, internally, any other value?

Regards,
--
Nicolas George
tim nicholson
2015-01-19 15:16:01 UTC
Permalink
This post might be inappropriate. Click to display it.
Nicolas George
2015-01-19 15:18:31 UTC
Permalink
Post by tim nicholson
I am struggling to follow some of the arguments, which seem far to
heated to be as useful as they could be.
Please see my other mail, I summarize it more extensively.
Post by tim nicholson
I am not sure what is currently wrong, and where, but please let us not
'fix' one scenario with a dirty hack that breaks hoers.
That is exactly what nvidia is doing and the patch is trying to un-do.

Regards,
--
Nicolas George
Kieran Kunhya
2015-01-19 23:58:05 UTC
Permalink
Post by tim nicholson
Taking the Sample/pixel aspect ratio (SAR) calculated above and feeding
that back into calculating a display aspect ratio *for the whole frame*
leads to a value that is not 16:9 (or 4:3), and it appears to be this
figure which is then reported as the DAR, rather than that of the active
picture. This is I think the disjoint between DAR and SAR that Kieran
refers to, there is only a mathematical relationship if you include the
overscan.
At least for the DVB/DVD/Blu-ray mandated aspect ratios (e.g 12:11 and
16:11 for PAL) you have to use further approximations (which is what
the nvidia driver is doing).

Kieran
Agatha Hu
2015-01-21 06:17:09 UTC
Permalink
Post by Philip Langdale
There is a long sad story behind all this, but it's somewhat ambiguous as to
whether DVD content should be treated as 720 pixels wide or 704 pixels, with
16 pixels cut off. If you decide is should be 704 pixels wide, you need to
adjust the sample aspect ratio to keep the final display aspect ratio correct.
For reasons we are not privy too, nvidia decided that the nvenc encoder should
apply this aspect correction, whether you want it to or not. (I guess there
might be a flag for it, but if there is it's not documented). So, if you want
to transcode DVD content at the original size, you need to adjust the aspect
ratio information you pass to the encoder to compensate for their 'correction'.
This 'correction' is only applied to 720x480 and 720x576 content - and it does
so regardless of the input aspect ratio.
---
libavcodec/nvenc.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/libavcodec/nvenc.c b/libavcodec/nvenc.c
index efa3f04..8a0d584 100644
--- a/libavcodec/nvenc.c
+++ b/libavcodec/nvenc.c
@@ -587,6 +587,18 @@ static av_cold int nvenc_encode_init(AVCodecContext *avctx)
ctx->init_encode_params.darWidth = avctx->width;
}
+ // De-compensate for hardware, dubiously, trying to compensate for
+ // playback at 704 pixel width.
+ if (avctx->width == 720 &&
+ (avctx->height == 480 || avctx->height == 576)) {
+ av_reduce(&dw, &dh,
+ ctx->init_encode_params.darWidth * 44,
+ ctx->init_encode_params.darHeight * 45,
+ 1024 * 1204);
+ ctx->init_encode_params.darHeight = dh;
+ ctx->init_encode_params.darWidth = dw;
+ }
+
ctx->init_encode_params.frameRateNum = avctx->time_base.den;
ctx->init_encode_params.frameRateDen = avctx->time_base.num * avctx->ticks_per_frame;
Here's the reply from NVENC engineers
It is not same thing as SAR. It is the display aspect ratio i.e
width/height = DAR/SAR

The calculation below should be like

If (avctx->sample_aspect_ratio.num > 0 &&
avctx->sample_aspect_ratio.den > 0 )
av_reduce(&dw, &dh, avctx->sample_aspect_ratio.num * avctx->width,
avctx->sample_aspect_ratio.den * avctx->height, INT_MAX);

else
av_reduce(&dw, &dh, avctx->width, avctx->height, INT_MAX);

ctx->init_encode_params.darHeight = dw;
ctx->init_encode_params.darWidth = dh;

Agatha Hu
Reimar Döffinger
2015-01-21 08:00:36 UTC
Permalink
Post by Agatha Hu
Post by Philip Langdale
There is a long sad story behind all this, but it's somewhat ambiguous as to
whether DVD content should be treated as 720 pixels wide or 704 pixels, with
16 pixels cut off. If you decide is should be 704 pixels wide, you need to
adjust the sample aspect ratio to keep the final display aspect ratio correct.
For reasons we are not privy too, nvidia decided that the nvenc encoder should
apply this aspect correction, whether you want it to or not. (I guess there
might be a flag for it, but if there is it's not documented). So, if you want
to transcode DVD content at the original size, you need to adjust the aspect
ratio information you pass to the encoder to compensate for their 'correction'.
This 'correction' is only applied to 720x480 and 720x576 content - and it does
so regardless of the input aspect ratio.
---
libavcodec/nvenc.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/libavcodec/nvenc.c b/libavcodec/nvenc.c
index efa3f04..8a0d584 100644
--- a/libavcodec/nvenc.c
+++ b/libavcodec/nvenc.c
@@ -587,6 +587,18 @@ static av_cold int nvenc_encode_init(AVCodecContext *avctx)
ctx->init_encode_params.darWidth = avctx->width;
}
+ // De-compensate for hardware, dubiously, trying to compensate for
+ // playback at 704 pixel width.
+ if (avctx->width == 720 &&
+ (avctx->height == 480 || avctx->height == 576)) {
+ av_reduce(&dw, &dh,
+ ctx->init_encode_params.darWidth * 44,
+ ctx->init_encode_params.darHeight * 45,
+ 1024 * 1204);
+ ctx->init_encode_params.darHeight = dh;
+ ctx->init_encode_params.darWidth = dw;
+ }
+
ctx->init_encode_params.frameRateNum = avctx->time_base.den;
ctx->init_encode_params.frameRateDen = avctx->time_base.num * avctx->ticks_per_frame;
Here's the reply from NVENC engineers
It is not same thing as SAR. It is the display aspect ratio i.e width/height = DAR/SAR
The calculation below should be like
If (avctx->sample_aspect_ratio.num > 0 && avctx->sample_aspect_ratio.den > 0 )
av_reduce(&dw, &dh, avctx->sample_aspect_ratio.num * avctx->width, avctx->sample_aspect_ratio.den * avctx->height, INT_MAX);
That doesn't answer anything.
Doing that, to keep the problem description short, creates different results from libx264 and any other encoder we have for DVD resolution which is highly undesirable for consistency and compatibility reasons if nothing else.
Agatha Hu
2015-01-22 04:15:44 UTC
Permalink
Post by Reimar Döffinger
Post by Agatha Hu
Here's the reply from NVENC engineers
It is not same thing as SAR. It is the display aspect ratio i.e width/height = DAR/SAR
The calculation below should be like
If (avctx->sample_aspect_ratio.num > 0 && avctx->sample_aspect_ratio.den > 0 )
av_reduce(&dw, &dh, avctx->sample_aspect_ratio.num * avctx->width, avctx->sample_aspect_ratio.den * avctx->height, INT_MAX);
That doesn't answer anything.
Doing that, to keep the problem description short, creates different results from libx264 and any other encoder we have for DVD resolution which is highly undesirable for consistency and compatibility reasons if nothing else.
_______________________________________________
ffmpeg-devel mailing list
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
We will fix the issue in driver, overscan compensation will be applied
to input DAR *only* if the DAR is 4:3 or 16:9, otherwise won't
unnecessarily modify the aspect ratio for resolutions like 720x480 and
720x576.
The fix will be on future driver, I'll let you know when the branch is
released.

Agatha Hu
Philip Langdale
2015-01-22 04:29:25 UTC
Permalink
On Thu, 22 Jan 2015 12:15:44 +0800
Post by Agatha Hu
We will fix the issue in driver, overscan compensation will be
applied to input DAR *only* if the DAR is 4:3 or 16:9, otherwise
won't unnecessarily modify the aspect ratio for resolutions like
720x480 and 720x576.
The fix will be on future driver, I'll let you know when the branch
is released.
Please, no. We do not want overscan compensation to be done by nvenc at
all. The compensation makes nvenc inconsistent with *all* other
encoders in ffmpeg. For every other encoder, if my input is 720x576 at
[64/45], then my output is the same. It does not do anyone any favours
for nvenc to implicitly alter the output DAR.

There is a separate discussion about whether DAR 'compensation' is
desirable in an encoder or not, but even if that discussion results in
a decision to apply DAR compensation at encode time, it will be done
globally in ffmpeg for all encoders - ie: at a layer above any
individual codec.

If nvidia feels it is important to offer this compensation 'feature',
then I would ask for a configuration option to turn it off - otherwise
we will need the 'de-compensation' logic in the patch I posted.

Thanks,

--phil
Agatha Hu
2015-01-22 05:12:59 UTC
Permalink
Post by Philip Langdale
On Thu, 22 Jan 2015 12:15:44 +0800
Post by Agatha Hu
We will fix the issue in driver, overscan compensation will be
applied to input DAR *only* if the DAR is 4:3 or 16:9, otherwise
won't unnecessarily modify the aspect ratio for resolutions like
720x480 and 720x576.
The fix will be on future driver, I'll let you know when the branch
is released.
Please, no. We do not want overscan compensation to be done by nvenc at
all. The compensation makes nvenc inconsistent with *all* other
encoders in ffmpeg. For every other encoder, if my input is 720x576 at
[64/45], then my output is the same. It does not do anyone any favours
for nvenc to implicitly alter the output DAR.
There is a separate discussion about whether DAR 'compensation' is
desirable in an encoder or not, but even if that discussion results in
a decision to apply DAR compensation at encode time, it will be done
globally in ffmpeg for all encoders - ie: at a layer above any
individual codec.
If nvidia feels it is important to offer this compensation 'feature',
then I would ask for a configuration option to turn it off - otherwise
we will need the 'de-compensation' logic in the patch I posted.
Thanks,
--phil
Disabling compensation is the best result, adding a flag is acceptable,
but NVENC5.0 has just released, such option won't be available until 6.0
is available, so...
Still waiting for nvenc teams feedback, the worst case is adding the
'de-compensation' logic for 4:3 and 16:9

Agatha Hu
Philip Langdale
2015-01-22 17:22:32 UTC
Permalink
On Thu, 22 Jan 2015 13:12:59 +0800
Post by Agatha Hu
Disabling compensation is the best result, adding a flag is
acceptable, but NVENC5.0 has just released, such option won't be
available until 6.0 is available, so...
Still waiting for nvenc teams feedback, the worst case is adding the
'de-compensation' logic for 4:3 and 16:9
If they can get a flag in eventually, that's still a good improvement.
We'll need the de-compensation logic in the short term, regardless of
anything else.

Thanks,

--phil

Reimar Döffinger
2015-01-22 04:41:11 UTC
Permalink
This post might be inappropriate. Click to display it.
Reimar Döffinger
2015-01-18 17:41:09 UTC
Permalink
Post by Philip Langdale
There is a long sad story behind all this, but it's somewhat ambiguous as to
whether DVD content should be treated as 720 pixels wide or 704 pixels, with
16 pixels cut off. If you decide is should be 704 pixels wide, you need to
adjust the sample aspect ratio to keep the final display aspect ratio correct.
For reasons we are not privy too, nvidia decided that the nvenc encoder should
apply this aspect correction, whether you want it to or not. (I guess there
might be a flag for it, but if there is it's not documented). So, if you want
to transcode DVD content at the original size, you need to adjust the aspect
ratio information you pass to the encoder to compensate for their 'correction'.
This 'correction' is only applied to 720x480 and 720x576 content - and it does
so regardless of the input aspect ratio.
Oh my god, this is so horribly broken I vote for doing a "return
-EBROKENAPI" and tell them we'll enable this when NVidia fixes their
stuff.
IMHO some features are not worth the hacks necessary.
compn
2015-01-19 00:16:55 UTC
Permalink
On Sun, 18 Jan 2015 18:41:09 +0100
Post by Reimar Döffinger
Oh my god, this is so horribly broken I vote for doing a "return
-EBROKENAPI" and tell them we'll enable this when NVidia fixes their
stuff.
IMHO some features are not worth the hacks necessary.
[08:51] <compn> just let nvidia change aspect, its their crap, let them
fix it

[08:57] <kierank> compn: the nvidia behaviour is correct
[08:57] <kierank> so no patches are necessary

-compn
Reimar Döffinger
2015-01-19 06:48:24 UTC
Permalink
Post by compn
On Sun, 18 Jan 2015 18:41:09 +0100
Post by Reimar Döffinger
Oh my god, this is so horribly broken I vote for doing a "return
-EBROKENAPI" and tell them we'll enable this when NVidia fixes their
stuff.
IMHO some features are not worth the hacks necessary.
[08:51] <compn> just let nvidia change aspect, its their crap, let them
fix it
[08:57] <kierank> compn: the nvidia behaviour is correct
[08:57] <kierank> so no patches are necessary
I don't really consider it "correct" if it doesn't do what a user expects, plus as I understood the discussion neither ffplay nor MPlayer nor VLC and probably WMP, flash player etc. will not show the video correctly if encoding e.g. a BluRay to DVD resolution via nvenc and playing it back.
If they do I'll have to reconsider, but if they don't it isn't working, "correct" or not.
Philip Langdale
2015-01-19 07:10:40 UTC
Permalink
On Mon, 19 Jan 2015 07:48:24 +0100
Post by Reimar Döffinger
Post by compn
On Sun, 18 Jan 2015 18:41:09 +0100
Post by Reimar Döffinger
Oh my god, this is so horribly broken I vote for doing a "return
-EBROKENAPI" and tell them we'll enable this when NVidia fixes
their stuff.
IMHO some features are not worth the hacks necessary.
[08:51] <compn> just let nvidia change aspect, its their crap, let
them fix it
[08:57] <kierank> compn: the nvidia behaviour is correct
[08:57] <kierank> so no patches are necessary
I don't really consider it "correct" if it doesn't do what a user
expects, plus as I understood the discussion neither ffplay nor
MPlayer nor VLC and probably WMP, flash player etc. will not show the
video correctly if encoding e.g. a BluRay to DVD resolution via nvenc
and playing it back. If they do I'll have to reconsider, but if they
don't it isn't working, "correct" or not.
Right. And it's obviously wrong if you consider that trying to encode
any square pixel content at this size results in non-square output.

I'd also argue that bt.601 semantics come into play when you're
authoring the DVD, and any transcoding you do after that should
preserve the input characteristics - and that means the encoded SAR is
based on a 720 pixel width. Why on earth my transcoded file should have
a different SAR from the original when doing a digital->digital
conversion is beyond me.

--phil
Nicolas George
2015-01-19 13:42:31 UTC
Permalink
Post by Philip Langdale
Right. And it's obviously wrong if you consider that trying to encode
any square pixel content at this size results in non-square output.
I'd also argue that bt.601 semantics come into play when you're
authoring the DVD, and any transcoding you do after that should
preserve the input characteristics - and that means the encoded SAR is
based on a 720 pixel width. Why on earth my transcoded file should have
a different SAR from the original when doing a digital->digital
conversion is beyond me.
I believe part of Kieran's point is: you do not encode to anamorphic 720×576
unless you are actually authoring a DVD, and in that case it should conform
to BT.601.

This makes sense: if you are starting with nice square pixels from HD
contents, why on Earth would you stretch them to an absurd resolution?

On the other hand, if you are authoring a DVD, you are encoding to MPEG-2,
not H.264, and therefore you are not using nvenc, so this is still absurd.

And of course, the fact that is behaves differently than every single other
encoder makes it a braindead design, even if the result is, in some regards,
considered correct.

Regards,
--
Nicolas George
Nicolas George
2015-01-19 13:33:52 UTC
Permalink
Post by compn
[08:57] <kierank> compn: the nvidia behaviour is correct
It would help a lot if he had the courtesy of actually explaining why he
considers the behaviour to be correct. After all, a lot of people in this
thread, possibly not as smart and knowledgeable as him but not stupid
either, have explained why they think not only it is incorrect but utterly
absurd, so the issue requires explanations.

In the meantime, I can try to summarise the terms of the problem to make
sure we are all talking about the same thing. Please anyone stop me if you
do not agree.


Considering an image (given as a rectangular of pixel values in memory), to
display it correctly, exactly one piece of extra geometric information is
needed: the shape of the physical oriented rectangle for the image.

That shape can be expressed as the ratio between the width and the height of
the rectangle, called "aspect ratio" of the rectangle. Let us call it
PhR_AR.

It can also be expressed in a number of other ways, but they all amount to
the same information, and therefore, for any given image, any of these way
can be derived from any other arithmetically. Of course, the derivation can
make use of the logical properties of the image, especially the number of
pixels in it (= its resolution).

One of these ways, in particular, is frequently used: the aspect ratio of
the physical pixels that build the physical image, assuming the physical
image is made of perfectly rectangular identical pixels. Let us call it
PhP_AR.

The relation between PhR_AR and PhP_AR is purely geometric, and could be
found by a high-school student:

PhR_AR = PhP_AR × (Nw/Nh)

where Nw and Nh are the number of physical pixels in the width and height
direction respectively.

PhR_AR is convenient when the image is logically scaled to change the number
of pixels that have to be displayed, because it is constant in this case.

PhP_AR is convenient when the borders of image are changed (by cropping or
padding) because it is constant in this case.

When reading aspect ratio information from files or bitstreams, or when
writing it, FFmpeg should conform exactly to the standard that specifies
said files or bitstreams.

On the other hand, the way FFmpeg codes the information in its internal data
structures, in its API and in its log messages is entirely for us to decide,
based only on considerations of making the API stable and convenient.

The choice that has been done is to use a field called
"sample_aspect_ratio", SAR in short, to store what I called PhP_AR.

FFmpeg also sometimes prints something called "display aspect ratio", DAR,
that is (approximatively, see below) what I called PhR_AR.

When decoding a MPEG-2 stream from a "widescreen PAL DVD", the MPEG-2
decoder will output images of 720×576 pixels.

Assuming Kieran's interpretation is correct, to correctly display these
images, software must only keep the middle 702×576 pixels and display them
on a physical rectangle with 16/9 aspect ratio.

Conversely, someone who wants to produce a "widescreen PAL DVD" must encode
the video with the same parameters.

The same implicit cropping applies to an indefinite number of situations
broadly called "conforming to BT.601".

But it does not apply to all situations, and in particular not to videos
that are intended for purely computerized use.

There are no fields, in FFmpeg data structure, to encode the implicit
cropping mandated by BT.601. This could be changed.

Since FFmpeg does not know about implicit cropping, when it prints DAR, it
assumes the number of physical pixels is the number of logical pixels in the
image.

Still assumes Kieran's interpretation, and still for a "widescreen PAL DVD",
we have PhR_AR = 16:9, Nw = 702 and Nh = 576.

That makes PhP_AR = 512:351.

Therefore, when a BT.601 video is stored in FFmpeg's data structures, the
data structures should hold SAR = 512:351.

This must happen as soon as the data structure is initialized, or at least
as soon as it is decided that the video is assumed to conform to BT.601.

Until FFmpeg knows about implicit cropping, it will print
DAR = (512:351) × (720/576) = 640:351.

A good API should be consistent and avoid surprises.

As a consequence, the individual encoders in FFmpeg should respect the
fields in the data structures scrupulously, including the SAR.

If heuristics are applied to detect a BT.601 video, they must happen in
common parts of the code, strictly before individual encoders, and they must
act by updating SAR and/or setting future dedicated fields.

The nvenc wrapper, without Philip's patch, violates these guidelines. With
Philip's patch, it works as expected and like other encoders.

I believe this summarizes all, sorry for the lengthy message.

Regards,
--
Nicolas George
Loading...