
How far does ME in XviD look while searching for a motion vector?

:/

Subj. Also, could it be that higher VHQ settings force the codec to look further for the best match?

There are three steps for motion search:
1. Check all 'predictions'. The predictions are all vectors which are already available and look like a good starting point: zero, the real prediction, the vectors on top and to the left, the vector in the previous frame, etc.
2. Do a gradient-descent search starting from the best position found so far. That simply means four (diamond) or eight (square) neighbouring positions are checked. If any of them is better than the current one, it becomes the new current position and the step repeats.
3. Sub-pixel refinement checks eight half-pixel positions, then eight quarter-pixel positions (if in qpel).

As you can see, there is no limit to the motion search range, other than the picture boundary and the maximum vector length that can be transmitted in the bitstream.
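To make steps 1 and 2 concrete, here is a minimal sketch in Python. It is not XviD's actual code: the names `sad`, `motion_search` and `predictions` are made up for illustration, the cost is a plain SAD with no vector-bit cost, only the diamond pattern is shown, and sub-pixel refinement and the bitstream limit on vector length are left out.

```python
DIAMOND = [(0, -1), (-1, 0), (1, 0), (0, 1)]  # four neighbours, integer pixels

def sad(cur, ref, bx, by, mv, size):
    """Sum of absolute differences for the block at (bx, by) displaced by mv."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + y + mv[1]][bx + x + mv[0]])
        for y in range(size)
        for x in range(size)
    )

def motion_search(cur, ref, bx, by, size, predictions):
    height, width = len(ref), len(ref[0])

    def cost(mv):
        # The only range limit modelled here is the picture boundary.
        x0, y0 = bx + mv[0], by + mv[1]
        if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
            return float("inf")
        return sad(cur, ref, bx, by, mv, size)

    # Step 1: start from the best of the candidate predictions.
    best = min(predictions, key=cost)
    best_cost = cost(best)

    # Step 2: move to the best neighbour until none of them improves.
    while True:
        step = min(((best[0] + dx, best[1] + dy) for dx, dy in DIAMOND), key=cost)
        step_cost = cost(step)
        if step_cost >= best_cost:
            return best, best_cost
        best, best_cost = step, step_cost
```

Note that the loop has no explicit range parameter: it keeps walking as long as each step lowers the cost, and stops at the first position where no neighbour is better.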

Nothing to increase

Radek

Ah, but the search range is limited by the closest minima, global and local alike.

@sysKin

Then could VHQ be guilty in those cases where extra-long motion vectors are dumped in favor of intra macroblocks?

Originally posted by OUTPinged_
Then could VHQ be guilty in those cases where extra-long motion vectors are dumped in favor of intra macroblocks?

There are two possibilities:
1. Motion was too difficult for ME and it didn't find the right vector. An intra block is a smart choice then.
2. ME found the vector, but VHQ decided that it wasn't worth coding and that an intra block was better. That's perfectly OK because VHQ _is_ an advanced, R-D based mode decision and _it's always right_ unless it's buggy.

So VHQ was probably responsible, and it was right to do that - it's always right
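For readers unfamiliar with R-D based decisions, the usual textbook form is to pick the mode with the lowest cost J = D + lambda * R, where D is distortion and R is the number of bits. The sketch below shows only that general idea; `rd_cost`, `choose_mode`, the lambda value and all the numbers are hypothetical and not taken from XviD.

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * bits

def choose_mode(modes, lam):
    """modes maps a mode name to (distortion, bits); return the cheapest name."""
    return min(modes, key=lambda name: rd_cost(*modes[name], lam))

# Hypothetical macroblock: the long vector makes the inter mode expensive in bits.
long_vector_mb = {"inter": (300, 120), "intra": (320, 100)}
print(choose_mode(long_vector_mb, lam=4))
```

With these made-up numbers the inter mode has slightly lower distortion, but its extra bits make the intra mode cheaper overall, which is the situation described in case 2.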

Radek

and it was right to do that - it's always right

Oh, please put more VHQ inside XviD!!

Originally posted by sysKin
....It's perfectly ok because VHQ _is_ an advanced, R-D based mode decision and _it's always right_ unless it's buggy

No, it's not always right, for two reasons: 1, PSNR is not the optimal distortion measurement for the human eye and 2, you don't even attempt a trellis encoding which would exploit macroblock decision interdependencies.

I should explain 2 a little better. Let's say VHQ chooses an Intra macroblock at MB position (x, y). But when you come to encode MB (x, y+1) you might find that it would have been better overall to use an Inter block at (x, y) because it would have provided a better predictor. The (nearly optimal) solution would be Viterbi/trellis VHQ, but this would be very, very, very excruciatingly slow.
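A toy sketch of the difference, in Python. Everything here is made up for illustration: the cost table `toy_cost`, the two modes, and the simplification that each macroblock depends only on the one before it (real dependencies are two-dimensional, which is where the slowness comes from).

```python
def greedy(n, modes, cost):
    """Decide each macroblock in order, looking only at the previous choice."""
    path, prev, total = [], None, 0
    for i in range(n):
        mode = min(modes, key=lambda m: cost(i, prev, m))
        total += cost(i, prev, mode)
        path.append(mode)
        prev = mode
    return total, path

def viterbi(n, modes, cost):
    """Keep the cheapest path ending in each mode, then pick the best overall."""
    best = {m: (cost(0, None, m), [m]) for m in modes}
    for i in range(1, n):
        best = {
            m: min(
                ((c + cost(i, p, m), path + [m]) for p, (c, path) in best.items()),
                key=lambda t: t[0],
            )
            for m in modes
        }
    return min(best.values(), key=lambda t: t[0])

def toy_cost(i, prev, mode):
    """Hypothetical costs: inter is cheap only after an inter neighbour."""
    if i == 0:
        return 10 if mode == "intra" else 12
    if mode == "intra":
        return 15
    return 5 if prev == "inter" else 20

MODES = ["intra", "inter"]
print(greedy(2, MODES, toy_cost))   # per-macroblock decision
print(viterbi(2, MODES, toy_cost))  # joint decision
```

In this toy table the per-macroblock choice takes intra for the first block because it is locally cheaper, and then pays for the poor predictor on the second block; the joint search accepts a slightly worse first block to get a cheaper total.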

Originally posted by temporance
You don't even attempt a trellis encoding which would exploit macroblock decision interdependencies.

Well yes, that's true - it's always right for a particular macroblock, not for all macroblocks. However, taking future MBs into account - here or anywhere else - would mean multi-pass motion search and would be extremely slow. Also, motion data takes about 10% of all data (at quantizers 3-4), so I doubt we could gain much.
Also note that a single MB being intra doesn't spoil the predictor too much, because the predictor is a median. Median of 10, 11 and 15 is 11, of 0, 11 and 15 is still 11.
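The two medians above can be checked with a few lines of Python. This only illustrates the arithmetic; `median3` is a name picked for the example, and treating the intra neighbour as contributing 0 follows the numbers in this post.

```python
def median3(a, b, c):
    """Middle value of three numbers."""
    return sorted((a, b, c))[1]

print(median3(10, 11, 15))  # three inter neighbours
print(median3(0, 11, 15))   # one neighbour replaced by 0
```

Both calls give the same middle value, 11, because replacing the smallest of the three inputs by something even smaller does not change which value sits in the middle.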

Radek

Originally posted by sysKin
There are three steps for motion search:
1. Check all 'predictions'. The predictions are all vectors which are already available and look like a good starting point: zero, the real prediction, the vectors on top and to the left, the vector in the previous frame, etc.

You aren't doing an acceleration vector?

Originally posted by SirDavidGuy
You aren't doing an acceleration vector?

Wow - what a compelling thought (even though I know very little of ME). Estimating a motion vector, by extending the difference of the previous two motion vectors.  Actually this is quite in the spirit of B-frames.
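One way to read that idea, as a small Python sketch: take the vector from the previous frame and add the change between the previous two frames. The function name and the linear extrapolation formula are my interpretation of the description above, not something taken from any encoder.

```python
def acceleration_candidate(mv_prev, mv_prev2):
    """Extrapolate: previous vector plus the change between the last two vectors."""
    return (2 * mv_prev[0] - mv_prev2[0], 2 * mv_prev[1] - mv_prev2[1])

# A block that moved by (1, 1) two frames ago and by (4, 2) in the last frame.
print(acceleration_candidate((4, 2), (1, 1)))
```

The result would simply be one more entry in the list of candidate predictions checked in step 1.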

(Edit: Clarification)

@syskin:
Median of 10, 11 and 15 is 11,

okay that's 36 div 3, that's 11, I get that
but

of 0, 11 and 15 is still 11

that's 26 and div 3 it's not 11 but more like 7

Or am I missing something here?

Cu Selur

Ps.: probably just a typo,...

He was talking about the median, not the average (mean = average).

The median is the middle element in a sorted list.  

So median(1,2,3) = 2 and median(1,2,100) = still 2.

One advantage of the median is your answer is guaranteed to be one of your original values.  

To be pedantic:  mean, median and mode are all averages.  If someone says average, they probably mean mean(!).
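For the two triples discussed in this thread, the difference is easy to see with Python's standard `statistics` module (the helper name `summarize` is just for this example):

```python
from statistics import mean, median

def summarize(values):
    """Return (mean, median) of a list of numbers."""
    return mean(values), median(values)

for triple in ([10, 11, 15], [0, 11, 15]):
    m, med = summarize(triple)
    print(triple, "mean:", round(m, 2), "median:", med)
```

The mean drops when one value becomes 0 (36/3 = 12 versus 26/3, about 8.67), while the median stays at 11 in both cases.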

Originally posted by SirDavidGuy
You aren't doing an acceleration vector?

No, it was just slow without any improvement...

As for median, FastMike is correct: median is the value which is 'in the middle'

Radek

One advantage of the median is your answer is guaranteed to be one of your original values.

As for median, FastMike is correct: median is the value which is 'in the middle'

That's not always true: when you have an even number of entries there's no 'middle' term. In this case the median is the arithmetic mean of the two most central terms: Median(0,1,2,3,4,5,6,7,8,9)=4.5
And this is no longer a value of the series unless both central terms are equal.
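A short Python sketch of that rule, assuming a non-empty list of numbers (the function is written out by hand here just to show both cases):

```python
def median(values):
    """Middle element for odd counts, mean of the two central ones for even counts."""
    s = sorted(values)
    mid = len(s) // 2
    if len(s) % 2:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

print(median(range(10)))    # even count: average of 4 and 5
print(median([1, 2, 100]))  # odd count: one of the original values
```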

just my two cents  

Originally posted by TaZ4hvn
That's not always true: when you have an even number of entries there's no 'middle' term. In this case the median is the arithmetic mean of the two most central terms: Median(0,1,2,3,4,5,6,7,8,9)=4.5
And this is no longer a value of the series unless both central terms are equal.

Yes, but it doesn't happen in mpeg-4
Prediction is always a median of 3 vectors. If any one of them is not available, it's set to 0,0. If two are not available, they are set to the third. If all three are not available, they are all zeros...
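The rule above can be sketched in Python like this. It follows the description in this post only; the function names, the use of `None` for an unavailable neighbour, and taking the median separately for the x and y components are assumptions made for the example.

```python
def median3(a, b, c):
    """Middle value of three numbers."""
    return sorted((a, b, c))[1]

def predict_mv(neighbours):
    """neighbours: three (x, y) vectors, with None where one is unavailable."""
    available = [v for v in neighbours if v is not None]
    if not available:
        filled = [(0, 0)] * 3          # nothing available: all zeros
    elif len(available) == 1:
        filled = available * 3         # two missing: copy the third
    else:
        # At most one missing: it is replaced by (0, 0).
        filled = [v if v is not None else (0, 0) for v in neighbours]
    xs, ys = zip(*filled)
    return (median3(*xs), median3(*ys))

print(predict_mv([(10, 2), (11, 3), (15, 1)]))
print(predict_mv([None, (11, 3), (15, 1)]))
print(predict_mv([None, None, (4, -2)]))
```

Since there are always exactly three inputs after filling, the even-count case never comes up.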