How to Build an AI-Powered WhatsApp Sticker Generator with Python
Translator | 李睿
Reviewer | 重樓
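Imagine sending memes and cartoon stickers that are entirely your own, instead of relying on generic material from the web. With OpenAI's recently released GPT-Image-1 model, you can turn selfies or everyday photos into funny or stylized personal stickers. This article shows how to build a WhatsApp sticker generator with Python in Colab, with support for several artistic styles, including caricature and Pixar-style filters.
Specifically, it covers how to set up the OpenAI image-edit API, capture or upload an image in Colab, choose a preset funny phrase or enter custom text, and use several API keys to generate three differently styled stickers in parallel, which cuts the waiting time considerably. The end result is a sticker-making tool driven by GPT-Image-1 and custom text prompts.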
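Why GPT-Image-1?
For this article, several leading image-generation models, including Gemini 2.0 Flash, Flux, and Phoenix, were evaluated on the Leonardo.ai platform. The evaluation found that these models generally struggle to render text and facial expressions accurately. For example:
- Even with explicit instructions, Google's Gemini 2.0 image API frequently produces misspelled or jumbled words. Given "Big Sale Today!", the output may read "Big Sale Todai" or random gibberish.
- Flux produces high overall image quality, but users commonly report that it "tends to introduce subtle errors when rendering text". Misspellings and gibberish become more pronounced as the text gets longer. The model also tends to generate very similar facial features: unless it is tightly constrained, its faces all look alike.
- Phoenix is optimized for image fidelity and prompt adherence, but like most diffusion models it treats text as a visual element rather than as semantic content, so text errors are common. In the evaluation, Phoenix only occasionally produced a sticker with the correct wording, and it kept returning the same default face for similar prompts.
In short, these limitations are the reason this project uses GPT-Image-1. Unlike the models above, GPT-Image-1 is paired here with a specially tuned prompt pipeline that explicitly steers it toward correct text and varied expressions.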
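How GPT-Image-1 edits images
GPT-Image-1 is OpenAI's flagship multimodal image model. It creates and edits images from text and image prompts and outputs high-quality results. The capability that matters here is editing an original image according to text instructions. In this article's example, the GPT-Image-1 API applies a humorous filter to the input photo and overlays text to produce a personalized sticker.
A carefully constructed prompt constrains the output to the sticker specification (a 1024×1024 PNG). GPT-Image-1 effectively becomes an AI-powered sticker creator: it changes the look of the subject in the photo and adds humorous text, automating the whole process.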
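A prompt is only a request, so it can be useful to check the returned file against the 1024×1024 PNG specification afterwards. The following is a minimal sketch that uses only the standard library; the helper names `png_dimensions` and `meets_sticker_spec` are hypothetical and are not part of the original script.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from a PNG's IHDR chunk."""
    # A PNG starts with an 8-byte signature followed by the IHDR chunk:
    # 4-byte length, 4-byte type "IHDR", 4-byte width, 4-byte height.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def meets_sticker_spec(data: bytes, expected=(1024, 1024)) -> bool:
    """True if the bytes are a PNG with the expected dimensions."""
    try:
        return png_dimensions(data) == expected
    except ValueError:
        return False
```

Only the first 24 bytes are inspected, so the check can run on the decoded API response before the file is written to disk.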
Python
# Set up OpenAI clients for each API key (to run parallel requests)
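clients = [OpenAI(api_key=key) for key in API_KEYS]
The code creates one OpenAI client for each API key. With three separate keys configured, three API calls can be in flight at the same time. This multi-key, multi-threaded approach relies on ThreadPoolExecutor for parallel processing, so each run produces three stickers concurrently. As the script's own output puts it, the stickers are generated "in parallel with 3 API keys", which noticeably speeds up sticker creation.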
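One way to supply the three keys without writing them into the notebook is to read them from environment variables. This is a sketch under assumptions: the variable names `OPENAI_API_KEY_1` to `OPENAI_API_KEY_3` and the helper `load_api_keys` are hypothetical and do not come from the original script.

```python
import os


def load_api_keys(prefix="OPENAI_API_KEY_", count=3, env=None):
    """Collect `count` keys named <prefix>1 .. <prefix>N from the environment."""
    env = os.environ if env is None else env
    keys = []
    for i in range(1, count + 1):
        name = f"{prefix}{i}"
        value = env.get(name, "").strip()
        if not value:
            # Fail early with the variable name rather than at the first API call
            raise KeyError(f"missing environment variable: {name}")
        keys.append(value)
    return keys
```

With this in place, `API_KEYS = load_api_keys()` could replace a hard-coded list, and the client list would be built from it exactly as before.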
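Step-by-step guide
Building your own AI sticker generator may sound complicated, but this guide breaks it down. It starts with setting up the environment in Google Colab, then covers the API, the phrase categories, text validation, generating stickers in different styles, and finally generating several stickers in parallel. Each step comes with code and an explanation. Let's start coding.
Installing and running in Colab
A proper setup is the prerequisite for generating stickers. This project uses Python libraries such as PIL and rembg for basic image processing, together with the google-genai library, to call the relevant services from a Colab instance. The first step is to install these dependencies directly in the Colab notebook.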
Python
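# openai is included because the code below imports it
!pip install --upgrade openai google-genai pillow rembg
!pip install --upgrade onnxruntime
!pip install python-dotenv
OpenAI integration and API keys
Once installation finishes, import the modules and set the API keys. The script creates one OpenAI client per key, which lets the code spread image-edit requests across several keys in parallel. The sticker-generation functions then use this list of clients.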
Python
# Stickerverse
import base64
import os
import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from io import BytesIO

from google.colab import files
from google.colab.output import eval_js
from IPython.display import display, Javascript
from openai import OpenAI
from PIL import Image
from rembg import remove

# Three API keys, one per parallel request (replace the placeholders)
API_KEYS = [
    "API KEY 1",
    "API KEY 2",
    "API KEY 3",
]

# One client per key
clients = [OpenAI(api_key=key) for key in API_KEYS]
Image upload and capture (logic)
The next step is to capture a photo with the camera or upload an image file. capture_photo() injects JavaScript into Colab to open the webcam, saves the captured frame, and returns its filename. upload_image() uses Colab's file-upload widget and validates the file with PIL.
Python
# Camera capture via JS
def capture_photo(filename='photo.jpg', quality=0.9):
    js_code = """
    async function takePhoto(quality) {
      const div = document.createElement('div');
      const video = document.createElement('video');
      const btn = document.createElement('button');
      btn.textContent = 'Capture';
      div.appendChild(video);
      div.appendChild(btn);
      document.body.appendChild(div);
      const stream = await navigator.mediaDevices.getUserMedia({video: true});
      video.srcObject = stream;
      await video.play();
      // Wait until the user clicks the button
      await new Promise(resolve => btn.onclick = resolve);
      const canvas = document.createElement('canvas');
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      canvas.getContext('2d').drawImage(video, 0, 0);
      stream.getTracks().forEach(track => track.stop());
      div.remove();
      return canvas.toDataURL('image/jpeg', quality);
    }
    """
    display(Javascript(js_code))
    data = eval_js("takePhoto(%f)" % quality)
    # The data URL looks like "data:image/jpeg;base64,<payload>"
    binary = base64.b64decode(data.split(',')[1])
    with open(filename, 'wb') as f:
        f.write(binary)
    print(f"Saved: {filename}")
    return filename


# Image upload function
def upload_image():
    print("Please upload your image file...")
    uploaded = files.upload()
    if not uploaded:
        print("No file uploaded!")
        return None
    filename = list(uploaded.keys())[0]
    print(f"Uploaded: {filename}")
    # Validate that the file is an image
    try:
        img = Image.open(filename)
        img.verify()
        print(f"Image verified: {img.format} {img.size}")
        return filename
    except Exception as e:
        print(f"Invalid image file: {str(e)}")
        return None


# Interactive image source selection
def select_image_source():
    print("Choose image source:")
    print("1. Capture from camera")
    print("2. Upload image file")
    while True:
        try:
            choice = input("Select option (1-2): ").strip()
            if choice == "1":
                return "camera"
            elif choice == "2":
                return "upload"
            else:
                print("Invalid choice! Please enter 1 or 2.")
        except KeyboardInterrupt:
            print("\nGoodbye!")
            return None
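As a complement to the PIL check in upload_image(), the leading bytes of an upload can be inspected before the file is opened at all. The sketch below is an assumption-labeled addition, not part of the original script; `sniff_image_type` is a hypothetical helper, and it recognizes only three common formats.

```python
def sniff_image_type(data: bytes):
    """Return 'png', 'jpeg', 'webp', or None based on leading magic bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    # WebP is a RIFF container whose form type at offset 8 is "WEBP"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None
```

Since files.upload() returns a dictionary that maps each filename to its raw bytes, a call such as `sniff_image_type(uploaded[filename])` could reject obviously wrong files early. It does not replace Image.verify(), which checks far more than the first few bytes.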

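Categories and example phrases
Next come the phrase categories that supply the sticker text. They are defined in a PHRASE_CATEGORIES dictionary that covers themes such as corporate, Bollywood, Hollywood, Tollywood, sports, and memes; the listing below shows three of them. Once the user picks a category, the script randomly selects three different phrases from it and applies one to each of the three sticker styles.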
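The random draw described above can be sketched on its own with `random.sample`, which picks entries without repetition. The helper `pick_phrases` is hypothetical; it is not from the original script, and it adds a check for categories that hold fewer phrases than requested.

```python
import random


def pick_phrases(categories, category, k=3, rng=random):
    """Pick k phrases from one category without repeating an entry."""
    phrases = categories.get(category)
    if phrases is None:
        raise KeyError(f"unknown category: {category!r}")
    if len(phrases) < k:
        # random.sample would raise a less descriptive ValueError here
        raise ValueError(f"category {category!r} has fewer than {k} phrases")
    return rng.sample(phrases, k)
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible, which can help when comparing prompts across runs.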
Python
PHRASE_CATEGORIES = {
    "corporate": [
        "Another meeting? May the force be with you!",
        "Monday blues activated!",
        "This could have been an email, boss!"
    ],
    "bollywood": [
        "Mogambo khush hua!",
        "Kitne aadmi the?",
        "Picture abhi baaki hai mere dost!"
    ],
    "memes": [
        "Bhagwan bharose!",
        "Main thak gaya hoon!",
        "Beta tumse na ho payega!"
    ]
}
Categories and custom text
The generator ships with a preset dictionary of phrase categories. Users can either draw random fun phrases from a chosen category or type their own text. The script also provides an interactive selection helper and a simple length check that keeps custom phrases within the limits expected for a sticker.
Python
def select_category_or_custom():
    print("\nChoose your sticker text option:")
    print("1. Pick from phrase category (random selection)")
    print("2. Enter my own custom phrase")
    while True:
        try:
            choice = input("Choose option (1 or 2): ").strip()
            if choice == "1":
                return "category"
            elif choice == "2":
                return "custom"
            else:
                print("Invalid choice! Please enter 1 or 2.")
        except KeyboardInterrupt:
            print("\nGoodbye!")
            return None


# Ask the user for a custom phrase and enforce the length limits
def get_custom_phrase():
    while True:
        phrase = input("\nEnter your custom sticker text (2-50 characters): ").strip()
        if len(phrase) < 2:
            print("Too short! Please enter at least 2 characters.")
            continue
        elif len(phrase) > 50:
            print("Too long! Please keep it to 50 characters or fewer.")
            continue
        else:
            print(f"Custom phrase accepted: '{phrase}'")
            return phrase
For custom phrases, the input length is checked (2 to 50 characters) before the phrase is accepted.
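The length rule can also be separated from the input() loop so that it is easy to test in isolation. This is a minimal sketch; `check_phrase` and the two constants are hypothetical names rather than part of the original script.

```python
MIN_LEN, MAX_LEN = 2, 50


def check_phrase(raw: str):
    """Return (ok, cleaned_text_or_error_message) for a candidate phrase."""
    phrase = raw.strip()
    if len(phrase) < MIN_LEN:
        return False, f"Too short! Please enter at least {MIN_LEN} characters."
    if len(phrase) > MAX_LEN:
        return False, f"Too long! Please keep it to {MAX_LEN} characters or fewer."
    return True, phrase
```

An interactive wrapper would then only need to call `check_phrase(input(...))` in a loop and print the message whenever the first element is False.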
Phrase validation and spelling guardrails
Python
def validate_and_correct_spelling(text):
    # Ask a small chat model to return only the corrected text
    spelling_prompt = f"""
    Please check the spelling and grammar of the following text and return ONLY the corrected version.
    Do not add explanations, comments, or change the meaning.
    Text to check: "{text}"
    """
    response = clients[0].chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": spelling_prompt}],
        max_tokens=100,
        temperature=0.1
    )
    corrected_text = response.choices[0].message.content.strip()
    return corrected_text
Next comes a sample build_prompt function that gives the model a baseline set of instructions. Note that build_prompt() first calls the spelling validator and then embeds the corrected text in a strict prompt template.
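Because build_prompt() embeds whatever the validator returns, one optional safeguard is to fall back to the user's original wording when the "correction" fails or drifts too far from the input. The sketch below is an assumption, not the author's method: `safe_correct`, its `corrector` argument, and the 0.6 similarity threshold are all hypothetical choices, and the similarity measure comes from the standard library's difflib.

```python
from difflib import SequenceMatcher


def safe_correct(text, corrector, min_similarity=0.6):
    """Apply `corrector`, but keep `text` if the result looks unusable."""
    try:
        corrected = corrector(text).strip()
    except Exception:
        # Network or API failure: keep the user's wording
        return text
    if not corrected:
        return text
    # Ratio is 1.0 for identical strings and falls toward 0.0 as they diverge
    similarity = SequenceMatcher(None, text.lower(), corrected.lower()).ratio()
    return corrected if similarity >= min_similarity else text
```

In the script, a call such as `safe_correct(text, validate_and_correct_spelling)` could stand in for the direct validator call. The threshold would need tuning, particularly for transliterated phrases that a spell checker may not recognize.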
Python
# Concise prompt builder with spelling validation
def build_prompt(text, style_variant):
    corrected_text = validate_and_correct_spelling(text)
    base_prompt = f"""
    Create a HIGH-QUALITY WhatsApp sticker in {style_variant} style.
    OUTPUT:
    - 1024x1024 transparent PNG with 8px white border
    - Subject centered, balanced composition, sharp details
    - Preserve original facial identity and proportions
    - Match expression to sentiment of text: '{corrected_text}'
    TEXT:
    - Use EXACT text: '{corrected_text}' (no changes, no emojis)
    - Bold comic font with black outline, high-contrast colors
    - Place text in empty space (top/bottom), never covering the face
    RULES:
    - No hallucinated elements or decorative glyphs
    - No cropping of head/face or text
    - Maintain realistic but expressive look
    - Ensure consistency across stickers
    """
    return base_prompt.strip()
Style variants: caricature vs. Pixar
The three style templates are stored in STYLE_VARIANTS. The first two are caricature transformations with exaggerated features, and the third is a Pixar-style 3D look. These strings are passed straight into the prompt builder and determine the final visual style.
Python
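STYLE_VARIANTS = [
    "Transform into detailed caricature with slightly exaggerated facial features...",
    "Transform into expressive caricature with enhanced personality features...",
    "Transform into high-quality Pixar-style 3D animated character..."
]
Generating stickers in parallel
The real strength of the project is parallel sticker generation. Using multithreading, the script runs three generation tasks at once, each with its own API key, which shortens the wait considerably.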
Python
# Generate a single sticker with GPT-Image-1 using a specific client (with timing)
def generate_single_sticker(input_path, output_path, text, style_variant, client_idx):
    try:
        start_time = time.time()
        thread_id = threading.current_thread().name
        print(f"[START] Thread-{thread_id}: API-{client_idx+1} generating {style_variant[:30]}... at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        prompt = build_prompt(text, style_variant)
        # Open the photo in a context manager so the file handle is closed
        with open(input_path, "rb") as image_file:
            result = clients[client_idx].images.edit(
                model="gpt-image-1",
                image=[image_file],
                prompt=prompt,
                # input_fidelity="high",
                quality="medium"
            )
        # The edited image comes back base64-encoded
        image_base64 = result.data[0].b64_json
        image_bytes = base64.b64decode(image_base64)
        with open(output_path, "wb") as f:
            f.write(image_bytes)
        end_time = time.time()
        duration = end_time - start_time
        style_type = "Caricature" if "caricature" in style_variant.lower() else "Pixar"
        print(f"[DONE] Thread-{thread_id}: {style_type} saved as {output_path} | Duration: {duration:.2f}s | Text: '{text[:30]}...'")
        return True
    except Exception as e:
        print(f"[ERROR] API-{client_idx+1} failed: {str(e)}")
        return False
# Create stickers with a custom phrase (all 3 styles use the same text)
def create_custom_stickers_parallel(photo_file, custom_text):
    print(f"\nCreating 3 stickers with your custom phrase: '{custom_text}'")
    print(" - Style 1: Caricature #1")
    print(" - Style 2: Caricature #2")
    print(" - Style 3: Pixar Animation")
    # Map futures to their info
    tasks_info = {}
    with ThreadPoolExecutor(max_workers=3, thread_name_prefix="CustomSticker") as executor:
        start_time = time.time()
        print(f"\n[PARALLEL START] Submitting 3 API calls SIMULTANEOUSLY at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        # Submit ALL tasks at once (non-blocking), all using the same custom text
        for idx, style_variant in enumerate(STYLE_VARIANTS):
            output_name = f"custom_sticker_{idx+1}.png"
            future = executor.submit(generate_single_sticker, photo_file, output_name, custom_text, style_variant, idx)
            tasks_info[future] = {
                'output_name': output_name,
                'text': custom_text,
                'style_variant': style_variant,
                'client_idx': idx,
                'submit_time': time.time()
            }
        print("All 3 API requests submitted! Processing as they complete...")
        completed = 0
        completion_times = []
        # Process results as they complete
        for future in as_completed(tasks_info.keys(), timeout=180):
            try:
                success = future.result()
                task_info = tasks_info[future]
                if success:
                    completed += 1
                    completion_time = time.time()
                    completion_times.append(completion_time)
                    duration = completion_time - task_info['submit_time']
                    style_type = "Caricature" if "caricature" in task_info['style_variant'].lower() else "Pixar"
                    print(f"[{completed}/3] {style_type} completed: {task_info['output_name']} "
                          f"(API-{task_info['client_idx']+1}, {duration:.1f}s)")
                else:
                    print(f"Failed: {task_info['output_name']}")
            except Exception as e:
                task_info = tasks_info[future]
                print(f"Error with {task_info['output_name']} (API-{task_info['client_idx']+1}): {str(e)}")
        total_time = time.time() - start_time
        print(f"\n [FINAL RESULT] {completed}/3 custom stickers completed in {total_time:.1f} seconds!")
# Create 3 stickers in PARALLEL (using as_completed)
def create_category_stickers_parallel(photo_file, category):
    if category not in PHRASE_CATEGORIES:
        print(f" Category '{category}' not found! Available: {list(PHRASE_CATEGORIES.keys())}")
        return
    # Choose 3 unique phrases for 3 stickers
    chosen_phrases = random.sample(PHRASE_CATEGORIES[category], 3)
    print(f" Selected phrases for {category.title()} category:")
    for i, phrase in enumerate(chosen_phrases, 1):
        style_type = "Caricature" if i <= 2 else "Pixar Animation"
        print(f" {i}. [{style_type}] '{phrase}' → API Key {i}")
    # Map futures to their info
    tasks_info = {}
    with ThreadPoolExecutor(max_workers=3, thread_name_prefix="StickerGen") as executor:
        start_time = time.time()
        print(f"\n [PARALLEL START] Submitting 3 API calls SIMULTANEOUSLY at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        # Submit ALL tasks at once (non-blocking)
        for idx, (style_variant, text) in enumerate(zip(STYLE_VARIANTS, chosen_phrases)):
            output_name = f"{category}_sticker_{idx+1}.png"
            future = executor.submit(generate_single_sticker, photo_file, output_name, text, style_variant, idx)
            tasks_info[future] = {
                'output_name': output_name,
                'text': text,
                'style_variant': style_variant,
                'client_idx': idx,
                'submit_time': time.time()
            }
        print("All 3 API requests submitted! Processing as they complete...")
        print(" - API Key 1 → Caricature #1")
        print(" - API Key 2 → Caricature #2")
        print(" - API Key 3 → Pixar Animation")
        completed = 0
        completion_times = []
        # Process results as they complete (NOT in submission order)
        for future in as_completed(tasks_info.keys(), timeout=180):  # 3 minute total timeout
            try:
                success = future.result()  # The future is already done, so this returns at once
                task_info = tasks_info[future]
                if success:
                    completed += 1
                    completion_time = time.time()
                    completion_times.append(completion_time)
                    duration = completion_time - task_info['submit_time']
                    style_type = "Caricature" if "caricature" in task_info['style_variant'].lower() else "Pixar"
                    print(f"[{completed}/3] {style_type} completed: {task_info['output_name']} "
                          f"(API-{task_info['client_idx']+1}, {duration:.1f}s) - '{task_info['text'][:30]}...'")
                else:
                    print(f"Failed: {task_info['output_name']}")
            except Exception as e:
                task_info = tasks_info[future]
                print(f"Error with {task_info['output_name']} (API-{task_info['client_idx']+1}): {str(e)}")
        total_time = time.time() - start_time
        print(f"\n[FINAL RESULT] {completed}/3 stickers completed in {total_time:.1f} seconds!")
        if len(completion_times) > 1:
            fastest_completion = min(completion_times) - start_time
            print(f"Parallel efficiency: Fastest completion in {fastest_completion:.1f}s")
Here, generate_single_sticker() builds the prompt and calls the image-edit endpoint; its client_idx parameter selects which API client to use. The parallel layer creates a ThreadPoolExecutor with at most three worker threads, submits all generation tasks at once, and uses as_completed to collect and handle results as they finish. This way, the script logs each sticker as soon as it is done.
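The logs track the state of every thread and record how long each task took and which style it applied (caricature or Pixar), which helps with monitoring and with analyzing the results.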
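The submit-then-collect pattern can be tried without any API key by swapping the image request for a sleep. The sketch below is a simplified stand-in, not the original script: `fake_api_call` and `run_parallel` are hypothetical names, and the delays only simulate network latency. It also shows that the timeout from as_completed is raised by the loop itself, so it has to be caught outside the per-future try block.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError, as_completed


def fake_api_call(client_idx, delay):
    """Stand-in for an image-edit request; sleeps instead of calling an API."""
    time.sleep(delay)
    return f"sticker_{client_idx + 1}.png"


def run_parallel(delays, timeout=5.0):
    """Submit one task per delay and collect results in completion order."""
    finished, failed = [], []
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        # Map each future back to the index of the task that produced it
        futures = {
            executor.submit(fake_api_call, idx, delay): idx
            for idx, delay in enumerate(delays)
        }
        try:
            for future in as_completed(futures, timeout=timeout):
                idx = futures[future]
                try:
                    finished.append(future.result())
                except Exception:
                    failed.append(idx)
        except TimeoutError:
            # Raised by as_completed when the overall deadline passes
            failed.extend(idx for f, idx in futures.items() if not f.done())
    return finished, failed
```

Calling `run_parallel([0.6, 0.2, 0.4])` returns the file names in the order the simulated calls finish rather than the order they were submitted, and the total wall time is close to the slowest task instead of the sum of all three.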
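Main execution block
At the bottom of the script, the __main__ guard runs sticker_from_camera() by default. You can comment or uncomment lines there to run interactive_menu(), create_all_category_stickers(), or other functions instead. These functions belong to the full script and are not shown in the excerpts above.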
Python
# Main execution
if __name__ == "__main__":
    sticker_from_camera()
Output video:

https://cdn.analyticsvidhya.com/wp-content/uploads/2025/10/final_op_stickerverse_1.mp4

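The complete code for this WhatsApp sticker generator is available in the project's GitHub repository.
Conclusion
This article walked through configuring GPT-Image-1 calls, building the sticker prompt, getting an image by capture or upload, choosing a preset funny phrase or entering custom text, and generating three style variants in parallel. In a few hundred lines of code, the project turns ordinary photos into personalized caricature-style stickers.
By combining OpenAI's vision model with creative prompt engineering and multithreading, users can generate fun personalized stickers in seconds. The resulting AI-powered WhatsApp sticker generator produces stickers with one click, ready to share with friends and groups. Try it with your own best photos and favorite one-liners.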
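Frequently asked questions
Q1. What does the AI-powered WhatsApp sticker generator do?
A: It uses OpenAI's GPT-Image-1 model to turn a photo that the user uploads or captures into a fun, stylized WhatsApp sticker, optionally with text.
Q2. Why is GPT-Image-1 a better fit than other image models?
A: In the evaluation described in this article, GPT-Image-1 handled text accuracy and facial expressions better than Gemini, Flux, or Phoenix, so the text on the stickers comes out correct and the result is visually expressive.
Q3. How does the script speed up sticker generation?
A: It uses three OpenAI API keys and a ThreadPoolExecutor to generate three stickers in parallel, which shortens the processing time.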
Original title: Create an AI-Powered WhatsApp Sticker Generator using Python
Author: Vipin Vashisth
Article link: https://www.analyticsvidhya.com/blog/2025/10/whatsapp-sticker-generator/

















