I'm just going to answer in terms of (what I understand of) Metzinger's approach, since it is amenable to me and is materialist while taking phenomenal selfhood seriously.
It would be a cluster concept if none of those 'types of conscious states' had one essential defining feature. But they do. They're all phenomenally conscious in the sense that there's something it is like to be in them. — bert1
To my knowledge, Metzinger disagrees with the inference you just made. Specifically he claims that there can be "core components" of phenomenal selfhood and phenomenal experience which are universally shared, but nevertheless the concept is a cluster concept. That universal aspect of phenomenal selfhood he calls "minimal phenomenal selfhood", and the universal aspect of experience is "minimal phenomenal experience".
With those terms in place, I think he also strongly disagrees with the claim that "minimal phenomenal selfhood" and "minimal phenomenal experience" contain anything like what qualists intend by "what is it like" states.
That said, if one of them doesn't have the feature of phenomenal consciousness (say, a robot (or zombie or whatever) creating a model of the world it can use to make predictions 'in the dark'), then it's not a conscious state in that sense. Phenomenal consciousness picks out exactly one feature/property and one feature only, the presence of which is essential to the definition. That sense of 'consciousness' isn't a cluster concept.
I believe there's a connotation in there: contrasting phenomenal consciousness (in humans) with an absence of phenomenal consciousness (in robots) construes consciousness as a binary property - on or off. That might be true for Metzinger, but only for minimal phenomenal experience/selfhood - if someone is said to have experience or selfhood at all, they will have minimal phenomenal experience/selfhood (definitionally). The content and structure of such a state is left unanalysed (so far in this thread at least) save for the assertion that it consists of "what is it like" states, or even a general impression of "what is it like" over state-aggregates in a unified phenomenal experience (to be disambiguated).
The "rub" of making these distinctions is that what a qualist may construe as definitive of phenomenal experience/selfhood may turn out to be too much - in that it contains unnecessary structures or types of content. How "what is it like" relevant states are construed may intersect with that non-necessary content: those structures of experience that come with qualia but do not come with minimal phenomenal experiences.
Metzinger's account of minimal phenomenal experience (MPE) extracts 6 constraints that phenomenal experience must satisfy.
Wakefulness) The phenomenal character of tonic alertness (see section 3.1).
*Metzinger clarifies tonic alertness as:
"Put differently, an organism can be tonically alert without knowing that it is alert: Consciousness is knowing that one is alert. An organism can embody a rich space of epistemic capacities without having an internal model of this fact."
What that puts me in mind of is the ongoing feed of downtuned sensations from my back when I'm lying down and drifting off to sleep. That "floating on a cloud" feeling. I am receptive to/modelling the sensations of my back on the bed without having an awareness that I am doing so (an internal representation of that representation, as it were).
Low Complexity) often described as the complete absence of intentional content, in particular of high-level symbolic mental content (i.e., discursive, conceptual, or propositional thought), but also of sensorimotor or affective content
Self-luminosity) A phenomenal property instantiated during some MPE episodes, typically described as “radiance”, “brilliance”, or the “clear light” of primordial awareness.
*Metzinger clarifies this as the "functional autonomy of tonic alertness" - it's a process that goes on all the time, and it doesn't care if you currently have a self, ego, are conscious etc. Luminosity seems to be a form of pre-perspectival attunement and registration of bodily signals, the pre-self building blocks of individuated "sense impressions" which come to take on conceptual and qualitative character when filtered and chunked through internal modelling. I'm thinking of them as the feelings which just slip away before they're there!
Introspective availability) We can sometimes actively direct introspective attention to consciousness as such and we can distinguish possible states by the degree of actually ongoing access.
Epistemicity) The phenomenal experience of knowing, which comes in degrees and can also be described as the subjective quality of confidence
Transparency/Opacity) Like all other phenomenal representations, MPE can vary along a spectrum of opacity and transparency
*(Transparency is the degree to which a phenomenal state is not experienced as a representation. - me)
Broadly construed, this is "awareness of awareness" without "awareness of the individuated content of awareness". In that state, the body's self-modelling processes which normally would intend, judge, see red apples, and be aware that "I" need to eat breakfast don't occur. Awareness without the cognitive retrojection of objects and a self-identifying perspective subsuming them. That does not resemble anything like qualia, does it? There's little room for an embodied, self-aware agent with mind states directed towards objects' properties in that construal, at least in the way qualists contend.
Which isn't to say the kind of states that qualists think of are impossible, just that those states perhaps aren't the essential characteristics of consciousness, insofar as one can have phenomenal experience without anything resembling a quale (as often construed).
At the very least, I think Metzinger's efforts put the ball in the qualists' court for trying to show that those states are irreducible and primitive. In particular he suggests the following list that characterises "minimal phenomenal experience".
Paper here.
And if we're looking for evidence of such a thing - I think my list covers the bases. Though it isn't tailored to discriminate each facet individually.