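Hallo is an AI lip-sync portrait animation technique jointly developed by Fudan University, Baidu, ETH Zurich, and Nanjing University. Given speech audio as input, it generates realistic, dynamic portrait videos. The project is described in detail below.
Project background:
Portrait image animation: The goal is to generate a talking portrait from a single still image and the corresponding speech audio. It has considerable value in video games and virtual reality, film and television production, and social media and digital marketing.
Limitations of traditional methods: Lacking an effective audio-to-video generation approach, earlier face video synthesis usually relied on parametric models as an intermediate representation. These methods are constrained by the limited capacity of parametric models to express facial expressions and motion, and by the weak correlation between audio and motion.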
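Hallo technical features:
End-to-end model: Hallo generates video directly from audio, with no parametric intermediate representation and no additional motion input, and produces face videos with natural, varied lip movements, expressions, and head motion.
Hierarchical audio-driven visual synthesis: The model uses a hierarchical audio-driven visual synthesis module. Through hierarchical cross-attention, it extracts masked features for separate regions (lips, face, and head) and learns the motion characteristics of each region, which markedly improves the realism of lip movements, expressions, and motion.
High-quality face animation: On real-person datasets, Hallo shows highly consistent lip sync and reflects fine details in the audio, such as emotion and speaking rhythm.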
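Technical architecture:
Network architecture: Hallo's network combines a diffusion-based generative model, a UNet-based denoiser, temporal alignment techniques, and a reference network to improve animation quality and realism.
Face encoder: A pretrained face encoder extracts identity features, which interact with the diffusion network's cross-attention modules so that the generated portrait animation stays faithful to the input character.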
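Data cleaning and training:
Automated video cleaning engine: Digital-human video data on the internet varies widely in quality, so the team built an automated cleaning engine, which has so far produced several thousand hours of high-quality digital-human video.
Support for multiple portrait styles: Although Hallo was trained only on real-person video datasets, it generalizes well to other styles, including cartoons, sketches, and sculptures.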
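Global motion controllability:
Hierarchical facial feature attention: By adjusting the weight coefficients of the three regions, Hallo can separately control the motion intensity of the lips, expression, and head pose, which greatly improves the controllability of the generated animation.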
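As a rough illustration of the per-region weighting idea (not the project's actual code), the sketch below blends three region-specific feature maps using per-region masks and scalar weights. The function name `blend_regions`, the region names, and the toy 2x2 maps are all hypothetical; in Hallo the features come from cross-attention layers inside the diffusion network, and the real combination step is likely more involved than a plain weighted sum.

```python
from typing import Dict, List

Grid = List[List[float]]


def blend_regions(
    features: Dict[str, Grid],
    masks: Dict[str, Grid],
    weights: Dict[str, float],
) -> Grid:
    """Return the element-wise sum of weight * mask * feature over all regions."""
    first = next(iter(features.values()))
    rows, cols = len(first), len(first[0])
    out = [[0.0] * cols for _ in range(rows)]
    for region, feat in features.items():
        w = weights.get(region, 1.0)  # regions without an explicit weight keep full strength
        mask = masks[region]
        for i in range(rows):
            for j in range(cols):
                out[i][j] += w * mask[i][j] * feat[i][j]
    return out


# Toy 2x2 example (hypothetical values): each region owns different cells.
features = {
    "lip": [[1.0, 1.0], [1.0, 1.0]],
    "expression": [[2.0, 2.0], [2.0, 2.0]],
    "pose": [[4.0, 4.0], [4.0, 4.0]],
}
masks = {
    "lip": [[0.0, 0.0], [0.0, 1.0]],
    "expression": [[0.0, 1.0], [1.0, 0.0]],
    "pose": [[1.0, 0.0], [0.0, 0.0]],
}
# Halve the pose motion, keep lips and expression at full strength.
blended = blend_regions(features, masks, {"lip": 1.0, "expression": 1.0, "pose": 0.5})
```

Lowering one weight scales only the cells covered by that region's mask, which is the intuition behind controlling lip, expression, and pose intensity separately.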
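Application prospects:
Film and television production: In the entertainment industry, Hallo can be used in films, TV series, and short videos to improve production efficiency and animation quality.
Games and virtual reality: With AI-driven characters, games and VR applications can present more vivid and realistic virtual worlds, increasing immersion and engagement.
Education: AI digital humans can make learning more intuitive and interactive through multisensory interaction, and can provide educational content better suited to the needs of disadvantaged groups.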
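Open source and community collaboration:
Open source: The Hallo project is open source and provides detailed deployment instructions and a web UI demo, making it easy for the community to build on it.
Community collaboration: The research teams at Fudan and Baidu plan to keep optimizing the model and improving animation quality, and hope to work closely with the community to advance its use across industries.
In summary, Hallo is an AI lip-sync portrait animation technique that uses an end-to-end model and a hierarchical audio-driven visual synthesis module to generate high-quality, realistic face videos. It has broad application prospects in film and television production, games and virtual reality, and education, and it has been released as open source.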
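Below is a screenshot of my run.
I quickly made two demos.
Take a look.
A quick note on how to use this open-source project:
I have prepared a simple example. Just run run.bat and it will generate a digital-human video from 1.png and 1.wav in the input folder and save it to output.
To change the output or run other functions, edit the .yaml files in the configs folder. They can be opened with Notepad.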
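Before launching run.bat, it can help to confirm that the files mentioned above are where the example expects them. The following is a minimal sketch under the assumption that the package layout is exactly as described (input/1.png, input/1.wav, run.bat, and a configs folder in the project root); `missing_inputs` and `list_yaml_configs` are hypothetical helper names, not part of the project.

```python
from pathlib import Path
from typing import List

# Relative paths taken from the description above; adjust if your package differs.
EXPECTED = ["input/1.png", "input/1.wav", "run.bat", "configs"]


def missing_inputs(project_dir: str) -> List[str]:
    """Return the expected files or folders that are absent from project_dir."""
    root = Path(project_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


def list_yaml_configs(project_dir: str) -> List[str]:
    """Return all .yaml files under configs/, as sorted paths relative to configs/."""
    cfg = Path(project_dir) / "configs"
    if not cfg.is_dir():
        return []
    return sorted(p.relative_to(cfg).as_posix() for p in cfg.rglob("*.yaml"))
```

For example, `missing_inputs(".")` run from the project root returns an empty list when everything is in place, and `list_yaml_configs(".")` shows which configuration files are available to edit.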
Experts, what does this mean?
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
W0000 00:00:1731287804.803928 8092 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
W0000 00:00:1731287804.814413 3116 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
#face is invalid: 0
Traceback (most recent call last):
File "E:\hallo2\scripts\inference_long.py", line 508, in
save_path = inference_process(command_line_args)
File "E:\hallo2\scripts\inference_long.py", line 216, in inference_process
source_image_lip_mask = image_processor.preprocess(
File "E:\hallo2\hallo\datasets\image_processor.py", line 141, in preprocess
get_mask(source_image_path, cache_dir, face_region_ratio)
File "E:\hallo2\hallo\utils\util.py", line 556, in get_mask
get_lip_mask(landmarks, height, width, os.path.join(
File "E:\hallo2\hallo\utils\util.py", line 464, in get_lip_mask
lip_landmarks = np.take(landmarks, lip_ids, 0)
File "E:\hallo2\jian27\lib\site-packages\numpy\core\fromnumeric.py", line 192, in take
return _wrapfunc(a, 'take', indices, axis=axis, out=out, mode=mode)
File "E:\hallo2\jian27\lib\site-packages\numpy\core\fromnumeric.py", line 59, in _wrapfunc
return bound(*args, **kwds)
IndexError: cannot do a non-empty take from an empty axes.
Press any key to continue . . .
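The line `#face is invalid: 0` just before the traceback suggests that the landmark step found no face in the source image, so the later `np.take` call had nothing to index into. Trying a source image with a single, clearly visible face is a reasonable first step. A minimal sketch of a guard that fails with a clearer message is shown below; `select_landmarks` is a hypothetical helper, not a function in the project, and it uses plain lists instead of NumPy arrays.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def select_landmarks(
    landmarks: Optional[Sequence[Point]], ids: Sequence[int]
) -> List[Point]:
    """Pick landmarks by index, failing early when no face was detected."""
    if not landmarks:
        # Covers both None and an empty sequence.
        raise ValueError(
            "no face landmarks detected in the source image; "
            "try a portrait with one clearly visible face"
        )
    return [landmarks[i] for i in ids]
```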
Have you set up your AI environment?
How do I set it up?
https://www.myhelen.cn/helen/259.htm This is the tutorial. Install CUDA 11.8 and the matching cuDNN.
To create a public link, set `share=True` in `launch()`.
save path: ./output_long/debug/720-1bi1
2024-10-25 16:56:08.5048404 [E:onnxruntime:Default, provider_bridge_ort.cc:1744 onnxruntime::TryGetProviderInfo_CUDA] C:\a\_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1426 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll"
*************** EP Error ***************
EP Error C:\a\_work\1\s\onnxruntime\python\onnxruntime_pybind_state.cc:866 onnxruntime::python::CreateExecutionProviderInstance CUDA_PATH is set but CUDA wasnt able to be loaded. Please install the correct version of CUDA andcuDNN as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they're in the PATH, and that your GPU is supported.
when using ['CUDAExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
****************************************
2024-10-25 16:56:08.7958579 [E:onnxruntime:Default, provider_bridge_ort.cc:1744 onnxruntime::TryGetProviderInfo_CUDA] C:\a\_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1426 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll"
Traceback (most recent call last):
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 419, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 483, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
RuntimeError: C:\a\_work\1\s\onnxruntime\python\onnxruntime_pybind_state.cc:866 onnxruntime::python::CreateExecutionProviderInstance CUDA_PATH is set but CUDA wasnt able to be loaded. Please install the correct version of CUDA andcuDNN as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they're in the PATH, and that your GPU is supported.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\gradio\queueing.py", line 622, in process_events
response = await route_utils.call_process_api(
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\gradio\route_utils.py", line 323, in call_process_api
output = await app.get_blocks().process_api(
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\gradio\blocks.py", line 2014, in process_api
result = await self.call_function(
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\gradio\blocks.py", line 1567, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\anyio\_backends\_asyncio.py", line 2441, in run_sync_in_worker_thread
return await future
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\anyio\_backends\_asyncio.py", line 943, in run
result = context.run(func, *args)
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\gradio\utils.py", line 846, in wrapper
response = f(*args, **kwargs)
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\gradio\utils.py", line 846, in wrapper
response = f(*args, **kwargs)
File "app.py", line 37, in app.predict
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\scripts\inference_long.py", line 210, in inference_process
with ImageProcessor(img_size, face_analysis_model_path) as image_processor:
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\hallo\datasets\image_processor.py", line 100, in __init__
self.face_analysis = FaceAnalysis(
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\insightface\app\face_analysis.py", line 31, in __init__
model = model_zoo.get_model(onnx_file, **kwargs)
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\insightface\model_zoo\model_zoo.py", line 96, in get_model
model = router.get_model(providers=providers, provider_options=provider_options)
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\insightface\model_zoo\model_zoo.py", line 40, in get_model
session = PickableInferenceSession(self.onnx_file, **kwargs)
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\insightface\model_zoo\model_zoo.py", line 25, in __init__
super().__init__(model_path, **kwargs)
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 432, in __init__
raise fallback_error from e
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 427, in __init__
self._create_inference_session(self._fallback_providers, None)
File "L:\ai-ruanjian-2023-09\hallo2-2024-10-25\hallo2\python\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 483, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
RuntimeError: C:\a\_work\1\s\onnxruntime\python\onnxruntime_pybind_state.cc:866 onnxruntime::python::CreateExecutionProviderInstance CUDA_PATH is set but CUDA wasnt able to be loaded. Please install the correct version of CUDA andcuDNN as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they're in the PATH, and that your GPU is supported.
CUDA and cuDNN are not installed correctly.
I have CUDA 12.61 and cuDNN 9.5.1 installed.
Try CUDA 11.8 and the matching cuDNN.
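`LoadLibrary failed with error 126` is the Windows error for a module, or one of its dependencies, that could not be found. That is consistent with the CUDA or cuDNN DLLs being absent from `PATH`, or belonging to a different version than the installed ONNX Runtime build expects; the requirements page linked in the error lists the supported combinations. A small sketch for listing which DLLs are visible on `PATH` follows. `find_dlls` is a hypothetical helper, and the glob patterns in the usage note are assumptions about typical file names that may need adjusting for your install.

```python
import fnmatch
import os
from typing import Dict, Iterable, List, Optional


def find_dlls(patterns: Iterable[str], path: Optional[str] = None) -> Dict[str, List[str]]:
    """Map each glob pattern to the matching files found in the PATH directories."""
    search = os.environ.get("PATH", "") if path is None else path
    hits: Dict[str, List[str]] = {p: [] for p in patterns}
    for directory in search.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue
        try:
            names = os.listdir(directory)
        except OSError:
            continue  # unreadable directory: skip it
        for name in names:
            for pattern in hits:
                # Case-insensitive match, since Windows file names are.
                if fnmatch.fnmatchcase(name.lower(), pattern.lower()):
                    hits[pattern].append(os.path.join(directory, name))
    return hits
```

Calling `find_dlls(["cudart64_*.dll", "cudnn*.dll"])` and getting empty lists, or file names from a different major version than expected, would point to the kind of mismatch discussed in this thread.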
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.2.2+cu121 with CUDA 1201 (you have 2.2.2+cu118)
Python 3.10.11 (you have 3.10.15)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
D:\BaiduNetdiskDownload\hallo2\jian27\lib\site-packages\pydub\utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
save path: ./output/1
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\glintr100.onnx recognition ['None', 3, 112, 112] 127.5 127.5
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\scrfd_10g_bnkps.onnx detection [1, 3, '?', '?'] 127.5 128.0
set det-size: (640, 640)
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1729764998.540941 34172 face_landmarker_graph.cc:174] Sets FaceBlendshapesGraph acceleration to xnnpack by default.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
W0000 00:00:1729764998.554655 32040 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
W0000 00:00:1729764998.562839 19228 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
Processed and saved: ./output/1\1_sep_background.png
Processed and saved: ./output/1\1_sep_face.png
Some weights of Wav2VecModel were not initialized from the model checkpoint at ./pretrained_models/wav2vec/wav2vec2-base-960h and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1', 'wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
2024-10-24 18:16:38,957 - INFO - separator - Separator version 0.17.2 instantiating with output_dir: ./output/1\audio_preprocess, output_format: WAV
2024-10-24 18:16:38,958 - INFO - separator - Operating System: Windows 10.0.19045
2024-10-24 18:16:38,959 - INFO - separator - System: Windows Node: DESKTOP-BNJA2AM Release: 10 Machine: AMD64 Proc: AMD64 Family 23 Model 113 Stepping 0, AuthenticAMD
2024-10-24 18:16:38,959 - INFO - separator - Python Version: 3.10.15
2024-10-24 18:16:38,959 - INFO - separator - PyTorch Version: 2.2.2+cu118
2024-10-24 18:16:38,961 - ERROR - separator - FFmpeg is not installed. Please install FFmpeg to use this package.
Traceback (most recent call last):
File "D:\BaiduNetdiskDownload\hallo2\scripts\inference_long.py", line 508, in
save_path = inference_process(command_line_args)
File "D:\BaiduNetdiskDownload\hallo2\scripts\inference_long.py", line 235, in inference_process
audio_processor = AudioProcessor(
File "D:\BaiduNetdiskDownload\hallo2\hallo\datasets\audio_processor.py", line 61, in __init__
self.audio_separator = Separator(
File "D:\BaiduNetdiskDownload\hallo2\jian27\lib\site-packages\audio_separator\separator\separator.py", line 140, in __init__
self.setup_accelerated_inferencing_device()
File "D:\BaiduNetdiskDownload\hallo2\jian27\lib\site-packages\audio_separator\separator\separator.py", line 147, in setup_accelerated_inferencing_device
self.check_ffmpeg_installed()
File "D:\BaiduNetdiskDownload\hallo2\jian27\lib\site-packages\audio_separator\separator\separator.py", line 173, in check_ffmpeg_installed
ffmpeg_version_output = subprocess.check_output(["ffmpeg", "-version"], text=True)
File "D:\BaiduNetdiskDownload\hallo2\jian27\lib\subprocess.py", line 421, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "D:\BaiduNetdiskDownload\hallo2\jian27\lib\subprocess.py", line 503, in run
with Popen(*popenargs, **kwargs) as process:
File "D:\BaiduNetdiskDownload\hallo2\jian27\lib\subprocess.py", line 971, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "D:\BaiduNetdiskDownload\hallo2\jian27\lib\subprocess.py", line 1456, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] 系统找不到指定的文件。
Press any key to continue . . .
What's going on here?
The CUDA version is wrong.
Boss, what's going on here?
save path: ./output/1
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\glintr100.onnx recognition ['None', 3, 112, 112] 127.5 127.5
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\scrfd_10g_bnkps.onnx detection [1, 3, '?', '?'] 127.5 128.0
set det-size: (640, 640)
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1729437915.446654 9128 face_landmarker_graph.cc:174] Sets FaceBlendshapesGraph acceleration to xnnpack by default.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
W0000 00:00:1729437915.454916 3604 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
W0000 00:00:1729437915.462255 4936 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
Processed and saved: ./output/1\1_sep_background.png
Processed and saved: ./output/1\1_sep_face.png
Some weights of Wav2VecModel were not initialized from the model checkpoint at ./pretrained_models/wav2vec/wav2vec2-base-960h and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1', 'wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
2024-10-20 23:25:15,746 - INFO - separator - Separator version 0.17.2 instantiating with output_dir: ./output/1\audio_preprocess, output_format: WAV
2024-10-20 23:25:15,747 - INFO - separator - Operating System: Windows 10.0.19045
2024-10-20 23:25:15,747 - INFO - separator - System: Windows Node: DESKTOP-07OKCTL Release: 10 Machine: AMD64 Proc: Intel64 Family 6 Model 151 Stepping 2, GenuineIntel
2024-10-20 23:25:15,747 - INFO - separator - Python Version: 3.10.15
2024-10-20 23:25:15,748 - INFO - separator - PyTorch Version: 2.2.2+cu118
2024-10-20 23:25:15,748 - ERROR - separator - FFmpeg is not installed. Please install FFmpeg to use this package.
Traceback (most recent call last):
File "H:\gongju\11\scripts\inference_long.py", line 508, in
save_path = inference_process(command_line_args)
File "H:\gongju\11\scripts\inference_long.py", line 235, in inference_process
audio_processor = AudioProcessor(
File "H:\gongju\11\hallo\datasets\audio_processor.py", line 61, in __init__
self.audio_separator = Separator(
File "H:\gongju\11\jian27\lib\site-packages\audio_separator\separator\separator.py", line 140, in __init__
self.setup_accelerated_inferencing_device()
File "H:\gongju\11\jian27\lib\site-packages\audio_separator\separator\separator.py", line 147, in setup_accelerated_inferencing_device
self.check_ffmpeg_installed()
File "H:\gongju\11\jian27\lib\site-packages\audio_separator\separator\separator.py", line 173, in check_ffmpeg_installed
ffmpeg_version_output = subprocess.check_output(["ffmpeg", "-version"], text=True)
File "H:\gongju\11\jian27\lib\subprocess.py", line 421, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "H:\gongju\11\jian27\lib\subprocess.py", line 503, in run
with Popen(*popenargs, **kwargs) as process:
File "H:\gongju\11\jian27\lib\subprocess.py", line 971, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "H:\gongju\11\jian27\lib\subprocess.py", line 1456, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] 系统找不到指定的文件。
Press any key to continue . . .
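Boss, what's going on here?
You need to install FFmpeg. I may have forgotten to include ffmpeg in the project directory. Search this site for it, download it, and put ffmpeg.exe in the project directory.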
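The traceback above ends in `subprocess.check_output(["ffmpeg", "-version"])` raising `FileNotFoundError`, which means no `ffmpeg` executable could be located. A quick way to check this before running the project is sketched below, using only the standard library; `find_ffmpeg` is a hypothetical helper name, and passing the project directory as an extra search location is an assumption that mirrors the advice to drop ffmpeg.exe there.

```python
import os
import shutil
from typing import Iterable, Optional


def find_ffmpeg(extra_dirs: Iterable[str] = ()) -> Optional[str]:
    """Return the path of the first ffmpeg executable found, or None."""
    # Search the extra directories first, then the regular PATH.
    search_path = os.pathsep.join([*extra_dirs, os.environ.get("PATH", "")])
    return shutil.which("ffmpeg", path=search_path)
```

If `find_ffmpeg(["."])` returns `None` when run from the project root, FFmpeg still needs to be installed or copied into place.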
Which site hosts the open-source project?
https://github.com/fudan-generative-vision/hallo2
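How do I download this? Clicking the netdisk link gives "access denied".
I tested it again and it works. Check whether something is wrong with your browser.
What are the minimum requirements to run it?
An NVIDIA GPU with 16 GB of VRAM.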
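To see whether a machine meets the 16 GB figure quoted above, the total VRAM can be read from `nvidia-smi`. This is a minimal sketch that assumes `nvidia-smi` is installed and on `PATH`. The function names are hypothetical, and the 5% tolerance is my own assumption, added because cards often report slightly less than their nominal capacity.

```python
import subprocess
from typing import List


def parse_vram_mib(output: str) -> List[int]:
    """Parse one integer (MiB) per non-empty line of nvidia-smi output."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def query_vram_mib() -> List[int]:
    """Ask nvidia-smi for the total memory of each GPU, in MiB."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_vram_mib(out)


def meets_requirement(
    vram_mib: List[int], required_gib: float = 16.0, tolerance: float = 0.05
) -> bool:
    """True if any GPU has at least required_gib of VRAM, within the tolerance."""
    threshold = required_gib * 1024 * (1.0 - tolerance)
    return any(v >= threshold for v in vram_mib)
```

Running `meets_requirement(query_vram_mib())` on the target machine gives a quick yes or no.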
It supports 4K and videos up to an hour long. Incredibly powerful!