It's a fantastic question, and a most difficult one to solve, given that we have to assume certain things that cannot be proven: external world, permanent objects, other minds, etc.
At some level all perception is direct. For the direct realist, the man directly perceives a tree. X directly perceives Y. — NOS4A2
I've struggled quite a bit with this formulation, but discussion here and elsewhere makes me want to state this in a different manner:
Most perception is direct (edge cases aside): we directly perceive a tree. But direct perception does not imply unmediated perception. In fact, it wouldn't be possible to have any experience if we did not mediate it; we would be like lumps of clay, capable of no cognition or perception.
So, we can say we have direct mediated perception and differentiate that from direct unmediated perception, which is sometimes obscurely implied when speaking about "direct realism".
something within the man (the mind, the brain, a little man) directly perceives something else within the man (sense data, representation, idea) — NOS4A2
This is the hard issue, open to widely varying interpretations.
I'd say we, as human beings (biological organisms), perceive sense-data, caused by objects. The "I" we use to denote a self is a "fiction" in Hume's sense: roughly, something we postulate that goes beyond what can be asserted given the evidence we have at our disposal.
But then this "I" is just as fictitious as a "rock" or a "tree." Put rather simply, a bird or a snake does not perceive these things AS "rocks" or "trees" - they lack the relevant symbolic system that allows for human language and concept formation.
That's roughly my take on it, but there's a lot more that could be said...