AI-Driven Multimodal NFT Generation Pipeline

SolSky introduces a streamlined, AI-powered pipeline for creating NFTs from multiple input modes: text, images, and traits. The pipeline is divided into four core stages, each adding flexibility and automation for creators.

Text-to-2D Image Generation

Users can simply enter descriptive prompts (e.g., “fluffy white cat”) to automatically generate original 2D artworks.

📌 Visual Example: "Text to 2D" → Cat illustration

  • Natural language input
  • Supports various artistic styles (cartoon, pixel, realism, etc.)
  • Works for both individual artworks and bulk NFT collections
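
For bulk collections, one way to picture the text-input stage is as a batch of prompts built from a few subjects and styles. The sketch below is illustrative only: `STYLES` and `build_prompts` are hypothetical names, and the prompt format SolSky actually expects is not specified here.

```python
from itertools import product

# Hypothetical style list, based on the styles named above.
STYLES = ["cartoon", "pixel art", "realistic"]


def build_prompts(subjects, styles=STYLES):
    """Return one prompt per (subject, style) pair, in a stable order."""
    return [f"{subject}, {style} style" for subject, style in product(subjects, styles)]


prompts = build_prompts(["fluffy white cat", "red rose"])
for prompt in prompts:
    print(prompt)
```

Each resulting string could then be submitted as a separate generation request, so a single subject list can produce an entire styled collection.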

Text-to-3D Model Generation

This step allows creators to bypass 2D entirely by generating 3D models directly from descriptive text.

📌 Visual Example: "Text to 3D" → Rose becomes a 3D object

  • Text input results in a fully rendered 3D model
  • Ideal for metaverse assets, avatars, and interactive collectibles
  • No prior 3D modeling experience required

2D Image Merging

SolSky enables smart layering and merging of 2D image components (such as backgrounds, characters, and traits) into unique composite artworks.

📌 Visual Example: Background + flower merged together
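
Layer merging of this kind is commonly done with the standard "source-over" alpha compositing rule. The following is a minimal pure-Python sketch of that rule for a single RGBA pixel, assuming 0-255 channels with straight (non-premultiplied) alpha. It is not SolSky's implementation, and the function names are hypothetical.

```python
def over(top, bottom):
    """Composite one RGBA pixel over another (channels 0-255, straight alpha)."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    a_top = ta / 255
    a_bot = ba / 255
    # Resulting coverage: the top layer plus whatever of the bottom shows through.
    a_out = a_top + a_bot * (1 - a_top)
    if a_out == 0:
        return (0, 0, 0, 0)  # both pixels fully transparent

    def blend(t, b):
        return round((t * a_top + b * a_bot * (1 - a_top)) / a_out)

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), round(a_out * 255))


def merge_layers(background, foreground):
    """Merge two equally sized images given as rows of RGBA tuples."""
    return [
        [over(fg_px, bg_px) for fg_px, bg_px in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(foreground, background)
    ]


# A 1x2 blue background with a flower layer that is opaque red on the left
# and fully transparent on the right.
background = [[(0, 0, 255, 255), (0, 0, 255, 255)]]
flower = [[(255, 0, 0, 255), (0, 0, 0, 0)]]
merged = merge_layers(background, flower)
```

Where the flower layer is opaque it replaces the background, and where it is transparent the background shows through unchanged.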

  • Supports trait-based combinations with rarity logic
  • Automatic layout optimization
  • Metadata generation for NFTs included
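
Trait combination with rarity logic can be sketched as weighted random selection per layer, followed by a metadata record per generated item. The example below is a minimal illustration under stated assumptions: the trait names, weights, and the `name`/`attributes` metadata shape are hypothetical and not taken from SolSky.

```python
import json
import random

# Hypothetical trait layers with rarity weights (higher weight = more common).
TRAIT_LAYERS = {
    "Background": {"Sky": 60, "Sunset": 30, "Galaxy": 10},
    "Character": {"Cat": 50, "Dog": 40, "Dragon": 10},
    "Accessory": {"None": 70, "Flower": 25, "Crown": 5},
}


def pick_traits(rng):
    """Draw one value per layer with probability proportional to its weight."""
    traits = {}
    for layer, options in TRAIT_LAYERS.items():
        values = list(options)
        weights = list(options.values())
        traits[layer] = rng.choices(values, weights=weights, k=1)[0]
    return traits


def generate_collection(size, seed=0):
    """Generate `size` unique trait combinations with metadata records."""
    max_unique = 1
    for options in TRAIT_LAYERS.values():
        max_unique *= len(options)
    if size > max_unique:
        raise ValueError(f"only {max_unique} unique combinations are possible")

    rng = random.Random(seed)  # seeded so a run can be reproduced
    seen = set()
    collection = []
    while len(collection) < size:
        traits = pick_traits(rng)
        key = tuple(traits.items())
        if key in seen:
            continue  # skip duplicates so every item is unique
        seen.add(key)
        collection.append({
            "name": f"Example #{len(collection) + 1}",
            "attributes": [
                {"trait_type": layer, "value": value}
                for layer, value in traits.items()
            ],
        })
    return collection


collection = generate_collection(10, seed=42)
print(json.dumps(collection[0], indent=2))
```

Rejecting duplicates shifts the realized trait frequencies slightly away from the raw weights, which becomes more noticeable as the collection size approaches the number of possible combinations.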

2D-to-3D Upload & Conversion

Users can upload 2D images and convert them into 3D models with AI assistance, or upload existing 3D files directly.

📌 Visual Example: 2D dog becomes a 3D dog

  • AI-assisted 2D-to-3D conversion
  • Real-time previews and customization options
  • Exportable to engines like Unity and Unreal, and compatible with WebXR