I have images in pages that are links to image attachments in other pages:
<ac:image>
<ri:attachment ri:filename="home.jpg">
<ri:page ri:content-title="Let's edit this page (step 3 of 9)" ri:space-key="ds" />
</ri:attachment>
</ac:image>
I can obtain the linked page's space (ds) and title (Let's edit this page (step 3 of 9)) from the OutgoingLink object but not the filename.
How can I obtain the filename (home.jpg) so that I can access and download the linked attachment?
The easy way, though it requires jsoup:
Page page = <Load the page here>;
// renderer is assumed to be an injected Confluence Renderer
final ConversionContext conversionContext = new DefaultConversionContext(page.toPageContext());
String rendered = renderer.render(page.getBodyAsString(), conversionContext);
Document doc = Jsoup.parse(rendered);
Elements images = doc.select("img");
for (Element el : images) {
    String imageUrl = el.attr("src");
    // Do something with the image URL
}
The harder way:
Page p = <Load the page here>;
// Cheap check that the body contains at least one ac:image element before parsing it
Pattern p1 = Pattern.compile("<ac:image[^>]*>.*?</ac:image>", Pattern.DOTALL);
if (p1.matcher(p.getBodyAsString()).find()) {
    Document doc = loadXMLFromString(p.getBodyAsString());
    NodeList images = doc.getElementsByTagName("ac:image");
    for (int i = 0; i < images.getLength(); i++) {
        Element element = (Element) images.item(i);
        // According to the probable scenarios in (1), the attributes sit on child elements of ac:image:
        // ri:attachment -> getAttribute("ri:filename")
        // ri:page       -> getAttribute("ri:space-key"), getAttribute("ri:content-title")
        // ri:url        -> getAttribute("ri:value")
    }
}
private Document loadXMLFromString(String xml) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    // A page body is a fragment, so wrap it in a single root element before parsing
    InputSource is = new InputSource(new StringReader("<root>" + xml + "</root>"));
    is.setEncoding("UTF-8");
    return builder.parse(is);
}
where (1) is https://confluence.atlassian.com/doc/confluence-storage-format-283640220.html
Hope it helps
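For reference, here is a minimal sketch of the DOM approach using only JDK classes, run against the markup from the question. The class name `ImageAttachmentRefs`, the `Ref` holder, and the `extract` method are hypothetical names chosen for illustration, not part of any Confluence API. It assumes the body is well-formed XML once wrapped in a root element; real page bodies may contain HTML entities that a plain XML parser rejects, so treat this as a starting point rather than a finished solution.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects attachment references from ac:image elements.
final class ImageAttachmentRefs {

    /** One attachment reference found inside an ac:image element. */
    static final class Ref {
        final String filename;
        final String pageTitle; // empty when no ri:page child is present
        final String spaceKey;  // empty when no ri:space-key attribute is present

        Ref(String filename, String pageTitle, String spaceKey) {
            this.filename = filename;
            this.pageTitle = pageTitle;
            this.spaceKey = spaceKey;
        }
    }

    static List<Ref> extract(String storageBody) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Not namespace-aware: "ac:image" is then an ordinary tag name and the
            // undeclared ac:/ri: prefixes do not cause a parse error.
            factory.setNamespaceAware(false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // The body is a fragment, so wrap it in a single root element.
            String wrapped = "<root>" + storageBody + "</root>";
            doc = builder.parse(new InputSource(new StringReader(wrapped)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Body could not be parsed as XML", e);
        }

        List<Ref> refs = new ArrayList<>();
        NodeList images = doc.getElementsByTagName("ac:image");
        for (int i = 0; i < images.getLength(); i++) {
            Element image = (Element) images.item(i);
            NodeList attachments = image.getElementsByTagName("ri:attachment");
            if (attachments.getLength() == 0) {
                continue; // for example an image referenced through ri:url
            }
            Element attachment = (Element) attachments.item(0);
            String title = "";
            String space = "";
            NodeList pages = attachment.getElementsByTagName("ri:page");
            if (pages.getLength() > 0) {
                Element page = (Element) pages.item(0);
                title = page.getAttribute("ri:content-title");
                space = page.getAttribute("ri:space-key");
            }
            refs.add(new Ref(attachment.getAttribute("ri:filename"), title, space));
        }
        return refs;
    }

    public static void main(String[] args) {
        String body = "<p>Intro</p><ac:image><ri:attachment ri:filename=\"home.jpg\">"
                + "<ri:page ri:content-title=\"Let's edit this page (step 3 of 9)\""
                + " ri:space-key=\"ds\" /></ri:attachment></ac:image>";
        for (Ref r : extract(body)) {
            System.out.println(r.filename + " | " + r.pageTitle + " | " + r.spaceKey);
        }
    }
}
```

With the filename, title and space key in hand, the owning page and its attachment can then be looked up through whichever page and attachment managers the plugin already uses.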
Thanks for the information, it was very useful to see a solution based on the page data rather than page related objects. I went for a combination of the two solutions that you provided:
Document doc = Jsoup.parse(page.getBodyAsString(), "", Parser.xmlParser());
Elements images = doc.getElementsByTag("ac:image");
for (Element image : images) {
    Element attachment = image.getElementsByTag("ri:attachment").first();
    if (attachment != null) {
        String fn = attachment.attr("ri:filename");
        Element sourcePage = attachment.getElementsByTag("ri:page").first();
        if (sourcePage != null) {
            String ct = sourcePage.attr("ri:content-title");
            String sk = sourcePage.attr("ri:space-key");
        }
    }
}
I am glad it helped :)
Hi Panos, thanks for your prompt reply.
I'm working on a Java plugin that downloads the images displayed in a page.
I can access the images directly attached to the page using attachmentManager.getLatestVersionsOfAttachments(page);
However, some images are displayed using links to attachments contained on other pages and to access these I was using page.getOutgoingLinks();
The problem is that the OutgoingLink object provides the space and title information for the referred page, but not the attachment filename.
It's the filename that I need, and I cannot see any way of obtaining it other than using some regex on the page content.
If I were you, I would follow the outgoing links and parse the ac:image element. I happen to be working on parsing a similar case, so if you decide to go that way I can help you further.
Thanks for the information, yes I would like to see how you are parsing the ac:image tag. I am using the XhtmlContent class but this does not see ac:image as a macro.
This is the relative format, so home.jpg is attached to the page on which this macro is located.
Since you don't say how you are obtaining it (Java? JavaScript? something else?), here are some options:
1) Rest call to find the url:
Make a call to /rest/prototype/1/content/PAGEID_HERE/attachment, find the attachment with filename="home.jpg", and extract the field "link"->"href" from the JSON.
2) Use backend:
Page p = pageManager.getPage(PAGEID_HERE);
Attachment attachment = attachmentManager.getAttachment(p, "home.jpg");
attachment.getUrlPath();