Unable to reproduce inference speed results #2

@bergstar

The log below reports `backend: CPU`, so I guess I need to force inference onto the GPU. How do I do that?

(base) nvidia@gx10-9074:~/Development/vibevoice.cpp$ time ./build/bin/vibevoice-cli asr   --model     models/vibevoice-asr-q4_k.gguf   --tokenizer models/tokenizer.gguf   --audio     2p_argument.wav
asr: loaded 1644800 samples (68.53s)
[vv I] backend: CPU
[vv I] loaded models/vibevoice-asr-q4_k.gguf: 1177 tensors, 22 kv (backend=CPU)
[vv I] vibevoice_load: hidden=3584  layers=28+0  vocab=152064  scaling=0.0000 bias=0.0000
[vv I] loaded models/tokenizer.gguf: 0 tensors, 13 kv (no tensor data)
[vv I] Tokenizer: loaded 151665 tokens, 151387 merges, 14 special
[{"Start":0,"End":12.24,"Speaker":0,"Content":"I can't believe you did it again. I waited for two hours. Two hours! Not a single call, not a text. Do you have any idea how embarrassing that was? Just sitting there alone?"},{"Start":12.4,"End":23.17,"Speaker":1,"Content":"Look, I know, I'm sorry, alright? Work was a complete nightmare. My boss dropped a critical deadline on me at the last minute. I didn't even have a second to breathe, let alone check my phone."},{"Start":23.17,"End":34.24,"Speaker":0,"Content":"A nightmare? That's the same excuse you used last time. I'm starting to think you just don't care. It's easier to say work was crazy than to just admit that I'm not a priority for you anymore."},{"Start":34.24,"End":45.49,"Speaker":1,"Content":"That's not fair. Of course you're a priority. You think I enjoyed being stuck in that office, drowning in spreadsheets while knowing I was letting you down?
asr: timing  load=4.6s  inference=180.9s  audio=68.5s  RTF=2.640

real	3m6.492s
user	5m56.223s
sys	3m0.181s
(base) nvidia@gx10-9074:~/Development/vibevoice.cpp$
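
For anyone comparing numbers: the RTF in the log appears to be inference time divided by audio duration. The short Python sketch below rests on that assumption (the definition is not taken from the vibevoice.cpp source) and reproduces the reported value from the timings above. The function name `real_time_factor` is only illustrative. The sample rate is likewise inferred from the sample count and duration in the first log line, not stated by the tool.

```python
# Values copied from the log above.
samples = 1_644_800      # "loaded 1644800 samples"
audio_s = 68.53          # "(68.53s)"
inference_s = 180.9      # "inference=180.9s"


def real_time_factor(inference_seconds: float, audio_seconds: float) -> float:
    """Assumed definition: processing time per second of audio."""
    return inference_seconds / audio_seconds


rtf = real_time_factor(inference_s, audio_s)
sample_rate = samples / audio_s  # inferred, roughly 24 kHz

print(f"RTF={rtf:.3f}")
print(f"implied sample rate: about {sample_rate / 1000:.0f} kHz")
```

Under this definition an RTF above 1 means transcription is slower than real time, so 2.640 on the CPU backend is about 2.6 seconds of compute per second of audio.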
